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PREFACE 


Increasing  complexities  and  challenges  of 
modern  systems  development  have  set  forth  equally 
new  and  urgent  comolexities  and  challenges  relative 
to  advancements  in  management  of  computer  software 
life  cycles. 

As  an  agency  of  the  United  States  Army 
Computer  Systems  Comnand,  the  Army  Institute  for 
Research  in  Management  Information  and  Computer 
Sciences  (AIRMICS)  was  honored  to  be  able  to  spon¬ 
sor  the  Second  Software  Life  Cycle  Management  Work¬ 
shop.  This  forum  brought  together  some  of  the  most 
notable  contributors  within  the  field  of  software 
life  cycle  management.  The  collective  thoughts  of 
this  prestigious  group  are  reflected  in  these  pro¬ 
ceedings  and  should  significantly  enhance  and  in¬ 
fluence  the  course  of  future  life  cycle  management 
directions. 

My  sincerest  personal  appreciation  is 
extended  to  all  those  who  participated  and  made 
the  Workshop  a  highly  successful  venture  in  tech¬ 
nology  exchange. 


Clarence  Giese 
Oirector,  AIRMICS 
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II.  EXECUTIVE  SUMMARY 
OF  THE 

SECOND  SOFTWARE  LIFE  CYCLE  MANAGEMENT  WORKSHOP 


By  Victor  R.  Basil i,  Edward  H.  Ely  and  Donovan  Young 


The  Second  Software  Life  Cycle  Management 
Workshop  brought  together  AO  Sof tware' Lif e  Cycle 
Management  Technology  researchers  and  90  attendees 
to  discuss  theory,  practice  and  technology  in 
managing  software  over  its  life  cycle. 

Building  on  last  year's  progress  in  out¬ 
lining,  identifying  and  describing  the  phenome¬ 
nology  of  software  development,  this  year's  par¬ 
ticipants  discussed  progress  in  validating,  refin¬ 
ing,  extending  and  exploiting  the  models  and 
metrics  reported  in  Software  Phenomenology  (work¬ 
ing  papers  of  the  Software  Life  Cycle  Management 
Workshop,  August  1977).  The  main  concern  ex¬ 
pressed  by  most  participants  was  to  foster  the 
emergence  of  a  viable  life  cycle  management 
technology  that  could  eventually  allow  accurate 
estimation  and  control  of  time  and  resources 
necessary  to  develop  and  support  software  in  the 
military  environment. 

•  Topics  of  interest  included  (1)  description 
and  understanding  of  various  components  of  the 
life  cycle,  (2)  ways  to  delineate  and  analyze 
relat ionships  among  component  activities,  (3) 
milestones  and  other  tools  to  help  direct,  coor¬ 
dinate,  understand  and  control  research  and 
development  in  software  life  cycle  management, 
and  (4)  development  of  management  tools  using 
the  results  of  life  cycle  management  research  to 
help  plan  and  manage  software  development  pro¬ 
jects. 

The  Workshop  was  divided  into  Measurement 
and  Methodology  areas,  and  each  of  these  areas 
was  subdivided  as  follows: 

1 .  Methodology 

A.  Theory:  Identification  of  life 
cycle  components  and  their  interre¬ 
lationships,  based  on  a  comprehensive 
view  of  the  overall  software  develop¬ 
ment  process.  Chairman,  John  H.  Manley. 

B.  Practice:  Formal  definitions  and 
managment  tools  found  or  expected  to 
be  useful  in  life  cycle  management 
and  control.  Chairman,  Raymond  W. 
Wolverton. 

2.  Measurement 

A.  Predictive  Models:  Models  derived 

*Numbers  in  brackets  ’-efer  to  page  numbers  in  this 
document. 


by  analyzing  the  consequences  of  a  set  oi 
assumptions  about  the  development  process 
calibrated  by  data.  Chairman,  Lawrence 
H.  Putnam. 

B.  Empirical  Models:  Models  based  on 
analysis  of  data  from  past  and  ongoing 
development  projects.  Chairman,  L.  A. 
Beladv . 


Participants  submitted  papers  in  advance, 
summing  up  their  research  and  reporting  new  un¬ 
published  results.  These  papers  appear  in  i  »e 
present  proceedings. 

The  two-day  workshop  began  with  short  pre¬ 
sentations  by  each  of  the  participants  to  stim¬ 
ulate  ideas  and  discussions.  This  was  followed 
by  a  formal  discussion  session  addressing  a  set 
of  questions  (included  herein)  regarding  topics 
of  interest  to  the  Department  of  the  Army.  The 
four  groups  outlined  above  met  concurrently,  each 
discussing  a  subset  of  questions.  The  results 
were  orally  presented  by  the  four  Chairmen  to 
the  Workshop  at  large.  Summarized  versions  of 
workshops  findings  and  recommendations  for  each 
of  the  four  topic  areas  prepared  by  Session 
Chairmen  are  included  at  the  end  of  this  Execu¬ 
tive  Summary. 

A  very  interesting  banquet  address  was  de¬ 
livered  by  Mr.  E.  Larry  Dreeman,  Chairman  of  the 
National  Security  Team,  Federal  ADP  Reorganiza¬ 
tion  Study,  the  Presidential  Reorganization  Pro¬ 
ject.  Mr.  Dreeman' s  remarks  centered  on  the  pre¬ 
liminary  findings  of  the  study  team's  efforts  and 
clearly  outlined  the  potential  sotfware  and  re¬ 
lated  system  management  challenges  facing  Depart¬ 
ment  of  Defense. 

This  Workshop  attracted  substantat ive  papers 
from  several  of  the  most  widely-cited  investiga¬ 
tors  in  software  management.  Taken  as  a  whole 
the  papers  demonstrate  rapid  progress  in  software 
management  science,  especially  in  the  field  of 
software  cost  modeling  and  prediction  (Boehm  and 
Wolverton  [  129 1,E1  shof  f  {172] ,  Halstead  [  1 74  ]  ,Nauman 
and  Davis  (63], Parr  [ 66] ,  Thibodeau  and  Dodson  [70], 
Stone  and  Coleman  [93],  Vele2  [ 95 ],  Putnam  [105] ,  and 
Tausworthe  [156]*  Less  rapid  but  significant  pro¬ 
gress  is  shown  in  the  field  of  software  mainten¬ 
ance  and  reliability  estimation  and  control  (Cur- 
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tis  et  al  [166] ,  Coel  [133] ,  Lit tlewood  [  146] ,  Musa  [153] 
Miyamoto  [19  5]  and  Sukert  [209]).  Some  progress  is 
evident  in  formal  monitoring  procedures  for  soft¬ 
ware  management  (Dickover  [52] , and  Basili  and  Zel- 
kowitz  [162])  and  in  automation  of  estimation  tasks 
(Shick  and  Lin [81]),  both  of  which  were  areas  in 
which  many  participants  expressed  intense  in¬ 
terest  and  hopes  for  further  results. 

The  main  conclusion  that  can  be  drawn  from 
the  collection  of  papers  is  that  tools  for  soft¬ 
ware  management  and  quantitative  assessment  of 
the  software  process  are  In  a  state  of  rapid 
development .  Despite  severe  difficulties  with 
definitions,  taxonomies  and  data,  and  despite 
the  fact  that  the  consensus  at  the  end  of  the 
1977  Workshop  was  that  better  measurement  and 
definition  tools  were  needed  before  successful 
management  tools  could  be  developed,  the  papers 
tended  to  concentrate  on  management  tools  and 
measurement  rather  than  definitions.  Even  the 
phenomenological  papers  were  written  and  presen¬ 
ted  with  a  clear  view  toward  managerial  appli¬ 
cation.  Participants  agreed  that  the  technical 
papers  in  the  Second  Workshop  were  more  evalua- 
tive  and  less  speculative  than  those  published 
last  year.  This  was  seen  as  a  sign  of  real  pro¬ 
gress,  but  it  was  also  agreed  that  additional 
evaluations  based  on  common  definitions  and  real 
data  are  still  lacking  and  badly  needed. 


Summary  of  Findings 

The  findings  of  the  Second  Workshop  include 
the  firm  conclusions  reported  by  the  discussion 
sessions  and  the  technical  papers,  validated  by 
discussion  at  large  among  participants  and  atten¬ 
dees.  These  findings  may  be  summarized  as  fol¬ 
lows: 

1 .  Formal  life-cycle  management  tools  are 
useful  in  the  development  phase  (Boehm 
and  Wolverton  [129] , Parr  [  66] ,  Stewart  [88], 
Velez  [95],  Putnam  [195]).  Management  tech¬ 
niques  for  the  development  phase  of  all 
kinds  of  software  are  presented  by 
Dickover  [52] and  by  Naumann  and  Davis 

[63 K  Many  limitations  of  these  methods 
were  discussed;  for  example,  Naumann  and 
Davis  cite  experience  that  formal  methods 
are  more  useful  for  low-uncertainty  pro¬ 
jects  than  for  projects  involving  truly 
new  software. 

2 .  A l 1  classes  of  software  should  be  managed 
according  to  a  common  framework,  but 
different  management  procedures  should  be 
used  for  each  phase  of  the  life  cycle. 

A  specific  set  of  life  cycle  phase  defi¬ 
nitions  was  prt posed  by  the  Methodology/ 
Theory  session  group. 

3.  Overall  life-cycle  cost  curves  are  use¬ 
ful  and  promising,  but  not  yet  well  vali¬ 
dated  by  real  data  from  a  multiplicity  of 
env i ronmen t s .  Many  different  sets  of 


assumptions  give  rise  to  different  mathe¬ 
matical  forms  of  life-cycle  curves,  all 
having  similar  goodness  of  fit  to  his¬ 
torical  data.  Parr [66]  and  Thibodeau  and 
Dodson [70]  offer  alternatives  to  the 
Rayleigh  curve. 

4 .  Reliable  data  for  calibration  of  life-cy - 
cle  curves  may  not  be  available .  I n  t he 
absence  of  data-based  verification,  des¬ 
pite  the  recommendations  of  the  previous 
Workshop,  a  promising  alternative  is  to 
generate  life-cycle  curves  by  aggregating 
data  from  PERT  anal  vs is  (Boehm  and  Wol¬ 
verton  [1 29  ],  par  r  [  66] ,  Tausworthe  [156]). 
This  would  allow  more  detailed  estimation, 
and  would  sidestep  the  difficulty  that 
empirical  data  cannot  discriminate  among 
alternative  aggregated  life-cycle  curves. 

5 .  Little  progress  has  been  made  in  develop¬ 
ing  automated  management  tools  for  life 
cycle  management  of  software.  Real-time 
decision-aiding  systems  would  be  desir¬ 
able.  Velez  [95]  has  reported  an  auto¬ 
mated  way  of  expressing  a  target  system 
as  a  data  base  in  a  special  language, 
giving  a  tangible,  measurable  object 
that  exists  prior  to  writing  any  target- 
system  code  (details  not  revealed). 

Schick  and  Lin[81]  report  automation  of 

a  small  but  important  task — interactive¬ 
ly  aiding  an  expert  in  the  development  of 
subjective  probability  distribution  for 
a  random  variable.  A  PERT-type  work- 
breakdown  procedure  has  been  partially 
automated  (Tausworthe  [156]). 

6.  Lines  of  code  per  man-month  is  not  a  sat¬ 
isfactory  indicator  of  productivity  and 
should  be  replaced  by  measures  that  in¬ 
corporate  quality  and  complexity  as  well 
as  length.  Reports  of  successful  length- 
only  productivity  measurement  (Curtis  et 
alD-63,  Halstead  [174  ]  seem  to  contradict 
this  finding;  but  their  data  did  not  cross 
organizational  boundaries,  and  Halstead’s 
data  came  from  an  organization  that  en¬ 
forces  standardized  coding  complexity 

and  corrects  line-of-code  counts  for  reuse 
of  standard  code. 

7 .  Software  reliability  modeling  is  rapidly 
maturing ,  so  that  models  such  as  those  of 
Litt lewood  [146]  and  Musa  [153]  can  be  used 
routinely.  Elshoff  [172 ] reports  success 
with  an  (unspecified)  predictive  measure 
for  estimating  the  time  to  revise  a  pro¬ 
gram. 

8 .  Good  life-cycle  models  should  possess 
many  detailed  characteristics  such  as 
those  proposed  by  the  Methodology/Theory 
session  group. 
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9 .  The  key  goal  regarding  management  infor¬ 
mation  tools  ts  Increased  visibility  by 
the  manager  at  all  times.  This  implies 
a  local  terminal  able  to  report  key  pro¬ 
ject  aggregates  on  demand. 

10.  Software  life  cycle  management  problems 
are  people-oriented,  not  machine-oriented, 
according  to  the  almost -unanimous  consen¬ 
sus  of  participants. 

11 .  Little  or  no  progress  has  been  made  in 
evaluating  maintainability.  Effective 
software  life  spans  have  been  impossible 
to  estimate  in  a  good  or  explicit  way. 
Software  seems  to  possess  finite  life 
and,  thus,  must  be  redone  or  scrapped 
within  a  few  years.  It  appears  to  be 
extremely  unusual  to  pay  as  much  as  a  10 
per  cent  premium  for  maintainability. 

12 .  Changes,  modifications  and  enhancements 
should  not  be  classed  and  treated  as 

ma intenance.  Most  participants  agree, 
however,  that  perceived  software  flexi¬ 
bility  (ease  of  accomplishing  prospec¬ 
tive  modifications)  should  be  considered 
during  design  and  procurement  decisions, 
and  there  should  a* so  be  a  recognition 
during  the  development  phase  that  changes 
are  inevitable  to  keep  software  working 
while  its  environment  evolves. 


Summary  of  Recommendations 

The  recommendations  of  the  Second  Workshop 
are  the  firm  recommendations  reported  by  the 
discussion  sessions  and  the  consensus  recommenda¬ 
tions  expressed  by  the  participants  and  attendees. 
These  recommendat ions  are  summarized  as  follows: 

1 .  Researchers  and  managers  should  adopt  a 
standardized  set  of  definitions  of  terms 
in  software  life  cycle  management.  A 
task  force  or  definitions  committee  should 
be  organized  to  work  on  this  problem. 

2 .  Resources  should  be  set  aside  specifically 
to  validate,  classify  and  test  software 
management  models.  A  project  should  be 
initiated  to  produce  an  evaluative  review 
or  "catalog”  of  existing  descriptive  and 
predictive  life-cycle  models,  describing 
each  model  and  listing  its  assumptions, 
purpose,  capabilities  and  limitations. 

3.  Research  should  be  done  to  provide  real¬ 
time  automated  management  tools  for  each 
phase  of  software  life  cycle  management, 
using  automated  metrics  and  measures  that 
incorporate  past  project  histories  and 
current  project  information.  Automated 
tools  for  programmers  are  also  needed. 


4 .  Large-scale  data-based  validation  pro¬ 
jects  should  be  undertaken  to  validate 
and  refine  existing  and  proposed  models 
and  metrics  and  to  help  provide  a  basis 
for  standardization  of  data  collection 
and  model  parameters. 

5.  A  taxonomy  of  software  environments 
should  be  established,  and  research 
should  be  done  to  elucidate  distinc¬ 
tions  among  individualized  software 
environments.  A  well-founded  taxonomy 
would  allow  objective  classifications 
that  are  indispensable  in  controlling 
sources  of  variance  in  statistical 
studies  of  life  cycle  phenomena  and 
metrics . 

6 .  Better  and  more  detailed  milestone  def¬ 
initions  need  to  be  established  to  pro¬ 
vide  managers  with  objective  project 
checkpoints  that  can  be  assessed  quan- 
titat ively . 

7 .  More  transfer  of  technology  is  needed 
f rom  project  to  project  and  from  or¬ 
ganization  to  organization.  Intensi¬ 
fied  effort  is  needed,  not  only  to  pro¬ 
vide  technology-transfer  vehicles  such 
as  these  Workshops,  but  also  to  provide 
training  in  methodology  and  tools  for 
software  managers.  Effort  is  also 
necessary  to  encourage  life  cycle 
management  research  projects  to  be 
carried  out  in  conjunction  with  ongoing 
software  development  and  maintenance 
programs.  The  Measurement /Pred let ive 
discussion  group  recommends  that  soft¬ 
ware  project  managers  not  be  asked  to 
experiment  on  their  projects,  but  only 
to  allow  data  to  be  collected  in  a 
neutral  manner.  Experiments  should 
make  data  available  to  managers  to  help 
them  manage. 

Recommendations  that  (1)  represent  minority 
opinions  within  the  discussion  groups  or  (2)  rep¬ 
resent  personal  opinions  of  the  discussion  lcr-nip 
chairmen  are  omitted  from  the  above  summary  but 
are  included  in  the  session  summaries  which  icl- 
low. 
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III.  QUESTIONS  FOR 

SOFTWARE  LIFE  CYCLE  MANAGEMENT  WORKSHOP 


The  following  are  a  list  of  questions  of  interest  in  the  area  of 
Software  Life  Cycle  Management.  They  are  meant  to  generate  discussion 
and  hopefully  elicit  information  of  benefit  to  the  community.  When  con¬ 
sidering  these  questions,  keep  in  mind  both  the  process  and  the  product. 

MILESTONES: 

Are  there  better  ways  to  characterize  and  measure  progress  than  the 
standard  definitions  of  milestones  as  points  in  time?  Can  we  capture  the 
dynamics  of  the  process,  i.c.,  the  interactiveness,  the  user's  involvement? 

Are  there  different  types  of  definitions  of  progress  with  respect  to 
time  and  with  respect  to  classes  of  projects,  e.g.,  first-time  efforts, 
standard  developments,  etc.? 

BUDGET: 

What  should  the  people-loading  curve  be  across  the  life  cycle  of  a 
system?  What  effect  do  different  methodologies  have  on  the  shape  of  that 
curve?  How  do  you  decide  how  much  do  to  do  within  a  fixed  budget?  What  is 
the  effect  of  size  and  organization  on  the  budget?  What  generic  factors 
are/should  be  present  in  software  cost-estimating  formulas?  What  are  the 
techniques  for  measuring  work  accomplished  versus  budgeted  dollars? 

PRODUCTIVITY: 

Are  there  better  ways  to  measure  group  productivity  than  lines  of  code 
per  man  month?  How  can  one  measure  individual  productivity  or  productivity 
on  small  projects?  What  are  the  important  factors  for  measuring  productivity 
during  development,  during  modification? 

TOOLS : 

Where  should  we  be  going  in  terms  of  automated  tools  for  managing 
the  software  development  process  and  aiding  the  development  personnel 
(Management  information  vs.  product  generating  tools)?  What  should  tools 
encourage? 

EMPIRICAL  STUDIES: 

What  do  we  really  need  to  know  to  understand  the  process  better? 

What  Information  should  he  collected  about  the  process,  the  product  and 
their  Interaction,  and  for  what  purpose?  What  kinds  of  experiments  and 
evaluations  should  be  performed?  How  can  we  capture ^the  idea  of  program 
complexity?  How  can  program  managers  be  convinced  tC  conduct  experiments 
on  their  programs?  What  progress,  if  any,  is  bring  made  on  the  transfer  of 
learning  from  project  to  project  within  and  between  organizations? 
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MODELS : 

Give  a  set  of  criteria  for  good  predictive  models  of  the  software 
life  cycle.  How  could  statistical  and  analytical  models  be  combined? 

Is  there  a  need  for  a  "standard"  set  of  generic  models  of  the  software 
life  cycle  process? 

METHODOLOGIES: 

What  are  the  components  of  an  overall  methodology?  Where  should 
software  technology  be  going?  Are  there  ”  standard izableM  methodologies? 
What  effect  do  different  software  development  characteristics  have  on  the 
implementation  or  adaption  of  methodologies  in  practice?  How  do  you 
characterize  a  methodology?  How  many  methodologies  exist  in  practice? 

MAINTENANCE/MODIFICATION: 

Is  there  a  way  to  determine  and  measure  the  effective  life  span  for 
software  systems?  How  do  you  know  when  to  redo  a  system?  What  are  the 
design  trade-offs  for  maintainability?  At  what  point  in  the  life  cycle 
should  maintenance  considerations  be  included?  What  strategy  should  be 
used  to  transfer  software  from  developer  to  maintainer? 

MANAGEMENT : 

What  are  the  major  ingredients  in  the  management  of  software?  What 
makes  it  unique?  What  makes  it  different  from  hardware,  for  example? 

How  should  the  organizational  structure  relate  to  the  problem  to  be  solved 
and  the  different  interactive  phases  of  development ?  To  what  extent  shoaid 
managers  be  technically  trained/involved?  To  what  extent  should  technic.il 
personnel  be  managerially  trained/ involved?.  Are  there  different  classifica¬ 
tions  of  software  that  require  different  methods  of  management  (e.g., 
embedded  vs.  non-embedded) ?  What  are  they?  Are  there  predictable  crises 
in  the  software  life  cycle  and  what  are  the  early  warnings? 

ENVIRONMENT: 

Is  there  anything  different  in  the  above  with  respect  to  future  tech¬ 
nological  developments,  e.g.,  small  computer  environments,  standardized 
modules?  What  changes  need  to  be  made?  Is  there  any  "scaling"  effect? 

GENERAL: 

What  parts  of  the  above  questions  should  he  attacked  first?  What  are 
solvable  in  the  next  five  years?  What  questions  would  you  like  to  see 
asked  next  year?  What  should  research  efforts  concentrate  on?  What  are 
the  ten  top  software  life  cycle  management  terms  that  need  definition? 

What  is  your  source  authority  for  present  definitions,  e.g.,  ANSI? 
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SOFTWARE  LIFE  CYCLE  MANAGEMENT:  DYNAMICS  THEORY 


Summarized  Dy 
Dr.  John  H.  Manley 


The  Johns  Hopkins  University 
Applied  Physics  Laboratory 
Laurel.  Maryland 


Abstract 

The  Dynamics  Theory  Group  discussed 
the  conceptual  relationships  of  software 
to  a  system  life  cycle  model,  and 
management  to  a  composite  software  life 
cycle  model.  A  "standard"  software  life 
cycle  management  model  is  proposed  that  is 
a  modification  of  the  generally  accepted 
Department  of  Defense  system  life  cycle 
model.  The  group  agreed  that  there  are  at 
least  five  distinct  types  of  Army 
management  involved  in  a  major  system 
software  management  life  cycle,  and  linked 
quantitative  life  cycle  milestones  to 
elementary  decision  theory.  Other 
findings  are  reported  that  suggest  several 
profitable  areas  for  Army  software 
management  research.  A  summary  of  a 
separate  report  being  submitted  to  the 
Chairman  of  the  President's  ADP 
Reorganization  Project  National  Security 
Team  is  included  as  a  sequel  to  the  highly 
stimulating  Monday  night  banquet 
presentation.  The  report  recommends  that 
ADP  resource  management  policymaking  for 
technical  issues  be  centralized;  ADP 
resource  acquisition  management  control  be 
functionally  decentralized,  and;  a  systems 
management  approach  similar  to  that 
developed  for  embedded  computer  systems  be 
used  for  ADP  system  life  cycle  management. 


Introduction 

The  Dynamics  Theory  Group  focused  on 
the  theoretical  aspects  of  software  life 
cycle  management  (SLCM) .  We  interpreted 
our  workshop  charter  quite  literally  and 
decided  not  to  get  involved  with  the 
details  of  management  practice  or  tools 
since  it  appeared  during  the  opening 
plenary  session  that  the  other  three 
groups  would  adequately  cover  those 
aspects  of  SLCM.  Therefore,  we  tried  to 
answer  only  a  few  of  the  most  basic 
questions  that  were  central  to  our 
'  dynamics  theory"  view  of  the  world.  For 
example,  we  tried  to  determine  whether  or 
not  traditional  ways  of  modeling  the 


software  life  cycle,  to  include 
miles  toning,  could  be  improved  upon. 

We  also  learned  from  the  plenary 
session  that  the  other  three  groups 
intended  to  concentrate  on  the  Full-Scale 
Development  phase  of  the  Department  of 
Defense  (DoD)  system  life  cycle  which  is 
primarily  involved  with  the  program 
management  aspects  of  developing  software. 
Therefore,  the  predominant  activity 
addressed  by  our  group  was  management  of 
the  complete  software  life  cycle,  with 
emphasis  on  aspects  other  than  the  more 
popular  area  of  program  management  of  the 
software  development  process.  It  was  our 
feeling  that  the  Army  should  be  concerned 
with  a  wider  range  of  management  problems 
to  include: 

a.  Developing  better  ADP  and 
tactical  system  requirements  that  include 
software  as  component  parts. 

b.  Administering  (as  the  buyer) 
the  software  development  technical 
management  process  which  in  many  cases  is 
carried  out  by  outside  contractor 
organizations . 

c.  Developing  a  better  under¬ 
standing  of  how  to  economically  maintain 
software  contained  in  operational  systems, 
both  as  a  user  and  logistician. 

Thus,  our  deliberations  were  directed 
more  from  the  perspective  of  Army 
managers,  be  they  corporate,  field, 
technical,  program,  or  logistics.  We  then 
took  our  first  step  into  the  arena  of 
software  life  cycle  management  methodology 
theory  by  trying  to  identify  life  cycle 
components  and  their  relationships  based 
upon  this  Army  manager  perspective  of  the 
software  development  process. 


SLCM  Management  Differences 
Fact  or  Fancy? 

The  first  specific  question  we 
addressed  was:  Are  there  different 
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classifications  of  software  (for  example, 
embedded  versus  non-embedded  or  functional 
versus  non- functional)  that  require 
different  methods  of  management  and,  if 
this  is  true,  what  are  they? 

We  reached  the  unanimous  conclusion 
that: 

Appropriate  management  methods 
that  are  applicable  to  different 
phases  of  the  life  cycle  do  not  vary 
across  different  classes  of  software, 
but,  the  specific  management  method 
used  can  be  and  usually  is  different 
for  different  phases  of  the  software 
life  cycle. 

This  first  finding  prompted  us  to 
closely  examine  the  life  cycle  management 
process  using  a  somewhat  different 
perspective  than  has  been  customary  in  the 
past.  As  a  first  step,  we  found  that  we 
could  use  the  Department  of  Defense  (DoD) 
system  life  cycle  as  it  is  commonly 
understood  (Reference  1)  to  generally  fit 
our  individual  theories  of  what  software 
life  cycle  management  means.  However,  we 
were  forced  to  make  what  turned  out  to  be 
several  conceptually  significant 
modifications  to  that  "system"  model  in 
order  to  develop  a  useful  "software"  life 
cycle  management  working  model. 


We  recommend  that  our  proposed 
software  life  cycle  management  model  shown 
in  Figure  1  be  used  as  a  strawman  baseline 
for  a  follow-on  research  effort  by  the 
Army  to  add  conceptual  detail  to  the 
individual  life  cycle  subphases.  Since  the 
top  levels  of  life  cycle  phase  terminology 
used  in  existing  Department  of  Defense 
documentation  remain  the  same,  that  is, 
Conceptual,  Validation,  Full-Scale 
Development,  Production,  Deployment  and 
Support  phases,  we  do  not  advocate 
replacing  any  existing  documentation  but 
would  simply  modify  it  as  described  below. 

Conceptual  Phase 

As  shown  in  Figure  1,  the  Conceptual 
Phase  has  been  divided  into  two  subphases, 
Conceptual  Requirements  Definition  and 
Conceptual  Requirements  Validation. 

Definition  Subphase.  Most  front  end 
major  system  requirements  development 
activity  is  performed  by  DoD  functional 
user  or  field  organizations  for  major 
systems.  In  fact,  field  commands  usually 
employ  relatively  large  groups  of  people 
(development  planners)  who  perform  this 
type  of  analysis  on  a  continuing  basis. 
Since  they  necessarily  possess  almost  a 
purely  functional  systems  orientation,  anv 
ADPE  (Automatic  Data  Processing  Equipment) 
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Fig.  1  Software  life  cycle  management  model. 
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included  in  such  systems  is  almost  always 
embedded  in  the  classic  sense  of  the 
original  definition,  that  is: 

"...a  computer  can  be  considered 
to  be  of  the  embedded  variety  when  it 
is : 

1,  physically  incorporated 
into  a  larger  system  whose  primary 
function  is  not  data  processing;  and 

2.  integral  to  such  a 
system  from  a  design,  procurement  and 
operations  viewpoint.  1  (Reference  2) . 


It  is  important  to  note  that  such  work 
does  not  usually  involve  software 
requirements  analysis  at  this,  the  very 
front  end  of  the  greater  system  life  cycle 
subcycle,  but  nevertheless  always 
represents  the  inital  activity  on  first¬ 
time  systems. 

The  term  "greater  system"  will  be 
referred  to  subsequently  and  requires 
clarification.  In  the  context  of  embedded 
computer  resources  in  the  tactical  or 
defense  systems  world,  the  greater  system 
refers  to  the  tank,  aircraft  or  missile 
that  contains  the  embedded  computers, 
computer  programs  and  computer  data  as 
component  parts.  We  will  show  later  that 
this  same  interpretation  can  also  be 
conceptually  applied  to  greater  (data 
processing)  systems  such  as  supply, 
personnel,  transportation,  payroll  and  so 
forth. 


Validation  Subphase .  The  Conceptual 
Requirements  Validation  subphase  comes 
next.  It  is  distinct  from  the  preceeding 
subphase  in  that  additional  players  are 
involved  during  the  important  paper 
feasibility  study  activities.  When  deemed 
necessary  (always  for  major  defense 
systems)  these  feasibility  studies  are 
usually  carried  out  jointly  by  both  the 
using  (field)  and  development  commands. 
The  development  conmands  are  three  of  the 
Joint  Logistic  Conmands  (JLC)  consisting 
of  the  Army's  Materiel  Development  and 
Readiness  Command  (DARCOM) ,  the  Air  Force 
Systems  Command  (AFSC)  and  the  Naval 
Material  Command  (NMC  or  NAVMAT) .  These 
conceptual  system  feasibility  studies 
seldom  involve  outside  contractors  and 
usually  have  high  military  security 
protection,  thus  being  quite  invisible  to 
both  the  academic  and  comnercial  software 
communities . 


Thus,  we  see  the  conceptual  phase  as 
necessarily  consisting  of  two  distinct 


subphases.  Conceptual  Requirements 
Definition  and  Conceptual  Requirements 
Validation.  As  will  be  explained  later, 
the  changes  in  the  participant  mix  from 
the  first  subphase  to  the  next  has  a 
direct  bearing  on  management  of  the 
software  life  cycle. 


Validation  Phase 

The  Validation  Phase  is  essentially 
one  in  which  the  DoD  validates  the 
solution  to  the  previously  validated 
reater  system  requirement.  This  is  the 
irst  phase  where  software  is  generally 
acknowledged  explicitly  as  a  system 
resource.  Specifically ,  in  this  phase  the 
program  characteristics  of  performance, 
cost  and  schedule  are  validated  and 
refined  through  extensive  study  and 
analysis,  actual  hardware  development,  or 
possibly  prototype  testing.  The  main  idea 
here  is  that  hardware  development  and 
evaluation  may  provide  corporate 
management  (Service  Staff,  Office  of  the 
Secretary  of  Defense  and  the  Congress) 
with  a  better  definition  of  program 
characteristics,  higher  confidence  that 
risks  have  been  resolved  or  minimized,  and 
a  greater  confidence  in  the  ultimate 
outcome  than  could  the  paper  studies 
generated  during  the  preceeding  Conceptual 
Phase. 

In  this  second  major  phase,  an  intial 
Program  Office  cadre  is  expanded  to  a  full 
program  office  and,  if  a  major  system  is 
involved,  software  becomes  an  item  of 
specific  interest  as  directed  by  Secretary 
of  Defense  level  instructions  (for 
example,  see  Reference  3). 


Full-Scale  Development  Phase 

During  the  next  major  life  cycle 
phase,  the  system  including  all  of  its 
support  items  is  designed,  fabricated  and 
tested.  The  intended  output  is,  as  a 
minimum,  a  preproduction  system  that 
closely  approximates  the  final  product, 
the  documentation  necessary  to  enter  the 
Production  Phase,  and  the  test  results 
that  meet  requirements.  Since  software 
can  be  replicated  precisely,  this  phase  is 
the  most  important  with  respect  to  the 
development  of  quality  software,  and  where 
most  of  the  emphasis  has  been  placed  to 
date  (as  opposed  to  hardware  where 
production  problems  can  be  severe  and  very 
costly) . 


Production  Phase 

When  more  than  one  copy  of  a  system 
must  be  produced,  this  phase  becomes 
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important,  especially  if  production  of 
copies  is  to  be  carried  out  over  a  period 
of  years  as  in  the  case  of  major  defense 
systems.  One  of  the  most  difficult  areas 
of  management  in  this  phase  involves 
change  control,  with  its  difficulty  being 
directly  proportional  to  the  amount  of 
change  activity.  The  same  methods  of 
planning,  development  and  testing  should 
be  followed  to  make  system  changes  during 
production  runs  as  were  used  previously  in 
the  Full-Scale  Development  phase. 

When  producing  copies  of  software 
systems  for  multiple  users,  the  same 
principles  apply  and  a  quite  close  analogy 
to  hardware  methods  can  be  conceptualized. 
Latent  defects  will  emerge  during  this 
phase  and  the  user  will  continue  to 
require  changes  to  the  system,  both  in 
hardware  and  software.  Each  change  must 
be  treated  as  a  mini  full-scale 
development,  both  in  principle  and  in 
fact,  especially  with  regard  to  management 
methods. 


Deployment  Phase 

This  phase  has  been  interpreted  and 
defined  somewhat  differently  by  each  of 
the  three  military  services  and  is 
consequently  difficult  to  describe  in 
terms  of  a  "standard"  software  life  cycle 
model.  In  general,  this  phase  includes 
events  generated  when  final  copies  of  new 
systems  are  actually  put  into  operational 
use  in  field  organizations.  We  feel  that 
additional  emphasis  should  be  placed  on 
the  management  activities  in  this  phase, 
especially  with  respect  to  the  problems  of 
transfer  of  development  management 
responsibility  to  the  support  management 
team.  It  is  an  extremely  difficult 
problem  to  determine  precisely  the  point 
when  any  system  (or  copy  thereof)  becomes 
truly  "operational."  We  recommend  that 
the  Army  follow  up  on  this  point  with 
further  research  to  develop  an  effective 
system  (software)  deployment  strategy.  To 
provide  a  starting  point,  we  offer  the 
following  description  of  two  proper 
subphases  which  we  feel  must  be  included 
as  a  minimum. 

Debugging  Subphase.  At  the  beginning 
of  the  beployment  Phase  when  a  new  system 
is  produced  and  first  handed  over  to  the 
using  organization,  the  embedded  software 
will  necessarily  contain  latent  defects. 
This  means  that  the  initial  activity  in 
the  phase  will  involve  bug  removal,  or 
simply  trying  to  get  the  system  to  run 
smoothly  in  the  user's  environment.  We 
call  this  the  Debugging  subphase.  This 
does  not  mean,  however,  that  when  most 
bugs  are  removed,  the  system  will  fully 
meet  current  user  requirements,  even 


though  it  has  been  proven  to  meet  the 
original  systems  specification  through 
extensive  testing. 

Fine  Tuning  Subphase.  Thus,  a  second 
distinct  subphase  we  call  Fine  Tuning  is 
necessary.  This  involves  tailoring  the 
system  to  meet  current  user  requirements. 
When  one  reflects  on  the  length  of  time 
involved  for  systems  to  pass  through  the 
Full-Scale  Development  and  Production 
phases,  it  is  small  wonder  that  they 
usually  do  not  meet  "current"  user's 
requirements  that  have  evolved  over  a 
period  of  perhaps  2,  5  or  10  years.  Only 
when  this  fine  tuning  is  completed 
however,  should  the  system  be  allowed  to 
enter  what  we  generally  understand  as  the 
Support  Phase.  But,  how  does  one 
determine  when  this  activity  is  truly 
finished?  Again,  we  recommend  the  Army 
focus  a  significant  research  effort  on 
this  critical  problem  area. 

In  any  event,  the  end  point  of  the 
Deployment  Phase  is  quite  clear,  the 
system  and  all  of  its  associated  software 
has  been  "thrown  over  the  wall"  to  the 
logistics  or  support  manager. . .ready  or 
not. 


Support  Phase 

As  has  been  reported  in  this  workshop 
and  elsewhere,  the  Support  Phase  is  now 
regarded  as  the  highest  cost  phase  with 
respect  to  the  total  software  life  cycle. 
It  is  the  one  on  which  we  need  to  focus 
most  of  our  attention  if  we  ever  hope  to 
significantly  reduce  software  life  cycle 
costs.  This  is  also  the  phase  in  which 
most  of  the  ADP  or  non-embedded  computer 
system  action  resides.  In  short,  the 
Support  Phase  is  where  most  of  our  money 
is  spent  and  where  the  Army  should 
concentrate  its  research  efforts  at 
finding  ways  to  reduce  software  support 
costs. 

In  view  of  its  importance,  the 
Dynamics  Theory  Group  spent  most  of  its 
time  discussing  this  phase  and  its 
software  life  cycle  management 
implications.  We  began  by  defining  what 
activities  should  be  carried  out  in  the 
Support  Phase.  We  decided  that  software 
support  should  include  at  least  the 
following  continuing  tasks: 


a.  Continuing  correction  of 
latent  bugs  and  technical  deficiencies  in 
software  as  they  are  discovered. 

b.  Making  system  changes  due  to 
modifications  to  equipment  (hardware)  that 
fails,  wears  out  or  is  replaced  for  other 
technical  reasons. 


c.  Any  other  external  impact 
forcing  a  software  change  that  does  not 
affect  user  greater  system  functions. 

These  activities  are  what  we  consider 
to  constitute  normal  software  maintenance 
in  a  quite  general  context. 

Modifications,  however,  will  also 
occur  for  other  reasons.  The  main  one 
being  that  the  user  requires  a  change  to 
the  functional  characteristics  of  his 
greater  system.  If  such  a  modification 
involves  changing  software  (or  hardware) , 
a  quite  different  type  of  support  activity 
will  be  required. 

Once  the  support  manager  begins  to 
invoke  this  last  type  of  change  process, 
he  is  changing  more  than  the  data 
processing  components  and,  unfortunately, 
this  fact  is  not  always  recognized  in  the 
professional  community.  Whenever  the 
greater  system  is  impacted,  the  life  cycle 
begins  a  new  subcycle  that  is  very  similar 
to  jumping  back  to  the  Full-Scale 
Development  phase  and  possibly  to  an  even 
earlier  phase.  Since  this  means  that  we 
must  reapply  project  management  methods 
through  the  modification  development, 
debugging  and  fine  tuning  processes,  we 
have  defined  this  continuing  cyclical 
activity  as  a  separate  subphase  called 
System  Modification. 


System  Disposal 

Eventually,  whether  it  be  six  months 
or  fifty  years,  when  corporate  management 
decides  that  the  greater  system  is  no 
longer  needed  by  the  organization,  it  is 
removed  from  the  active  system  inventory, 
thus  ending  the  software  management  life 
cycle. 


Management  Relationships 
“  to  the 

Modified  Software  Life  Cycle 


As  shown  in  Figure  1,  five  distinct 
types  of  management  are  involved  during 
the  different  phases  and  subphases  of  the 
life  cycle.  Their  relationships  to  the 
phases  and  also  to  each  other  are  briefly 
described  below. 


Conceptual  Phase  Management 

From  this  very  first  life  cycle  phase 
through  to  the  ultimate  disposal  of  the 
system,  the  functional  manager  is  (or 
should  be)  involved.  He  is  the  first  one 
to  organize  a  team  to  develop  the  initial 
greater  system  requirements.  He  oversees 
his  system  during  its  development, 


receives  it  in  the  field  and  operates  it 
until  its  disposal.  This  type  of 
management  is  primarily  concerned  with  how 
the  greater  system  performs  its  military 
mission.  All  computer  resources  are 
treated  as  embedded,  that  is,  as  component 
parts  of  the  greater  system.  The 
functional  manager's  continuing  bottom 
line  activity  is  to  plan,  develop  ani[ 
employ  systems  as  necessary  to  achieve  hi • 
operational  objectives. 

Whenever  a  new  system  or  major  system 
modification  concept  defined  by  the 
functional  manager  includes  future  or  even 
state-of-the-art  technology,  scientific 
and  engineering  people  are  usually 
employed  in  the  Requirements  Concept 
Validation  subphase  to  perform  technical 
feasibility  studies,  and  later  to  develop 
any  required  new  technology.  It  is 
generally  recognized  that  day-to-day 
management  of  scientists  and  engineers  is 
quite  different  than  so  called  "line 
management"  of  personnel  who  operate 
greater  systems  in  the  field.  There  are, 
in  fact,  different  curricula  in  most  major 
universities  that  address  engineering 
management/administration  separately  from 
business  management/administration.  Thus, 
it  must  be  recognized  that  in  the  early 
stages  of  the  software  life  cycle, 
technical  management  as  a  subset  of 
general  management  is  often  involved. 

Validation  Phase  Management 

Once  the  system  begins  the  Validation 
Phase,  planning  for  system  development 
begins  in  anticipation  of  an  approval  for 
system  full-scale  development  and  eventual 
production.  It  is  here  that  the  type  of 
program  office  is  defined,  the 
determination  of  the  type  of  contracts  to 
be  used,  decisions  on  which  components 
will  be  developed  in-house  and  which  out¬ 
house,  the  type  and  level  of  expertise  the 
program  manager  should  have  and  so  on. 
Once  these  preliminary  matters  are 
determined  by  corporate  management,  a 
program  manager  is  hired  and  a  program 
office  cadre  assembled.  This  is  the  point 
when  program  management  as  a  separate  type 
is  begun  as  an  addition  to  the  existing 
functional  area  management  and  technical 
management  teams  already  on  board. 

The  distinguishing  feature  of  program 
management  as  a  type  is  that  the  program 
manager  must  live  within  three  basic 
constraints : 

a.  A  "fixed"  budget. 

b.  An  "inflexible"  schedule. 

c.  A  "constantly  changing" 
specification  of  the  greater  system  he  is 
chartered  to  build. 
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This  type  of  management  is  the  one  we 
generally  talk  about  at  workshops  and 
conferences  such  as  this,  usually  to  the 
exclusion  of  the  other  types  shown  in  the 
expanded  life  cycle  model.  This  is  not 
reant  to  imply  that  the  program  management 
for  a  new  system  development  or  major 
modification  to  an  existing  system  is  not 
a  most  critical  occupation.  However,  our 
group  recognized  that  this  is  definitely 
not  the  only  type  of  management  involved 
in  the  overall  software  life  cycle. 


Full-Scale  Development 
Phase  Management 

Sometime  during  the  development  phase, 
logistics  people  must  get  involved  to 
insure  that  the  greater  systems  being 
developed  will  be  delivered  as 
"maintainable"  during  the  operations 
phase.  Once  the  logisticians  get  on  board 
as  shown  in  Figure  1,  they  remain  involved 
with  the  system  until  its  ultimate 
disposal. 


Product ion/Sup port 
Phase  Management 

When  the  system  has  been 
developed  to  the  point  that  the  major  bugs 
have  been  removed  and  it  has  been  fine 
tuned  to  satisfy  the  user,  responsibility 
for  the  system  is  formally  transferred 
from  the  program  manager  to  the  logistics 
manager  and  the  user  for  the  remainder  of 
the  life  cycle.  As  mentioned  before,  the 
system  can  be  changed  (sometimes  quite 
drastically)  by  the  logistics  manager  with 
no  outward  appearance  of  change  to  the 
user  in  functional  characteristics.  When 
the  user  requires  a  change,  however,  both 
the  user  and  the  logistician  must  work 
together  with  a  project  manager  to 
eventually  develop  a  new  system,  that  is. 
System  XYZ -Model  2,  -Model  3,  -Model  4  or 
-Model  n. 

Notice  that  we  make  the  distinction 
between  "program"  and  "project"  managers. 
The  program  manager  is  the  one  involved 
with  the  original  development  of  the 
greater  system.  A  succession  of  project 
managers  are  those  responsible  for  making 
modifications  to  the  delivered  greater 
system  during  the  Support  Phase  of  the 
system  life  cycle.  However,  both  program 
and  project  managers  use  the  same  tools 
and  techniques  to  perform  their  Important 
functions. 


Corporate  Software  Life  Cycle  Management 

Of  course,  overseeing  all  of  the 
management  types  mentioned  thus  far  are 


the  top-level  decision  makers  involved 
with  deciding:  (a)  when  the  system  can 
proceed  from  one  major  phase  to  the  next 
(Defense  System  Acquisition  Review 
Council) ,  (b)  whether  or  not  the  system 

will  be  funded  from  year  to  year 
(Congress) ,  (c)  whether  a  particular 

computer  will  be  approved  for  purchase 
(General  Services  Administration)  ,  (d) 

whether  the  system  is  needed  to  fulfill 
greater  National  security  needs 
(President,  Secretary  of  Defense,  Service 
Secretaries,  Joint  Chiefs  of  Staff  and  the 
Service  Staffs)  and  so  forth.  We 
consolidated  all  of  these  top-level 
managers  under  the  global  title  of 
corporate  management. 


Army  Management  Implications 

The  implications  for  Army  management 
that  stem  from  our  life  cycle  model  and 
the  concomitant  management  relationships 
described  above  are  contained  in  this 
section  of  our  report.  Our  discussions  in 
developing  the  modified  software 
management  life  cycle  model  generated  many 
digressions,  some  of  which  resulted  in 
theoretical  conclusions  that  we  felt 
warranted  reporting,  while  others  did  not. 
Thus,  the  following  topics  are  offered  as 
a  selection  of  those  we  feel  are  worth 
consideration  by  the  Army  as  potential 
topics  for  further  research. 


Life  Cycle  Cost  Implications 

Upon  close  examination  of  a  variety  of 
Army  and  other  automated  systems,  we  found 
that  we  could  easily  fit  any  kirtd  of  a 
greater  system  into  the  modified  life 
cycle  model  described  above.  Clearly, 
major  defense  systems  such  as  missiles, 
tanks,  aircraft,  command  and  control 
systems ,  ships  or  the  like  that  contain 
embedded  computer  resources  pass  through 
every  phase  we  have  described.  For 
example,  one  speaker  commented  in  the 
opening  plenary  session  that  the  B-52 
bomber's  original  conceptional  requirement 
was  developed  in  approximately  1948  and 
the  system  deployed  in  approximately  1952. 
Since  that  time,  it  has  undergone  many 
major  modifications.  He  stated  quite 
accurately  that  it  is  highly  unlikely  that 
anyone  at  the  beginning  of  that  program 
could  have  foreseen  that  the  B-52  would 
still  be  flying  today,  30  years  after 
system  concept  definition. 

The  management  implication  of  such 
long  term  continuing  modifications  to  a 
major  system  is  that  their  cost  over  the 
life  cycle  cannot  be  planned  or  even 
speculated  with  any  degree  of  reasonable 
accuracy.  Thus,  our  group  concluded  that: 


The  planned  cost  of  software 
embedded  In  a  system  should  terminate 
with  the  initial  entry  of  the  system 
into  the  System  Modification  subphase 
of  the  Support  Phase  of  its  life 
cycle. 

From  that  point  on,  each  new 
modification  should  be  treated  as  a 
separate  project  to  be  independently 
managed  over  a  mini-life  cycle. 

Another  example  closer  to  the 
Automatic  Data  Processing  community  is  the 
automated  post  or  base  supply  system  that 
is  comnon  throughout  the  Department  of 
Defense.  The  Conceptual  Phase  for  the 
automated  supply  system  began  in  the 
1950' s  and  the  system  became  fully 
operational  in  the  1960's.  Since  then, 
the  system  has  been  in  the  Support  Phase. 
This  perspective  of  that  greater  system, 
that  is,  the  automated  base  supply  system, 
indicates  that  it  may  never  reach  the 
Disposal  milestone.  Thus,  we  have  had  for 
the  past  decade  a  succession  of  both 
maintenance  actions  and  system 
modifications  which  have  been  successfully 
carried  out  as  they  became  necessary.  In 
no  case,  however,  has  a  totally  new  supply 
system  emerged  which  required  starting 
back  at  the  beginning  of  a  complete 
greater  system  life  cycle. 


Embedded  Computer  Resource 
Management  Imolications 


Our  finding  above  that  the  supply 
system  is  conceptually  the  same  as  the 
B-52  with  respect  to  its  embedded  computer 
resources  has  another  implication.  As 
some  of  the  original  greater  system 
embedded  computers  wear  out  and  have  to  be 
replaced,  they  fall  under  the  Maintenance 
subphase.  If  the  original  software  is 
captured  through  emulation  on  replacement 
machines,  we  have  not  really  changed  the 
system  in  the  eyes  of  the  functional  user. 
However,  if  we  make  a  modification  to 
satisfy  new  or  emerging  user  requirements, 
then  we  go  through  the  project  management 
and  the  development  life  cycle  as  far  back 
as  the  change  warrants. 

Upon  closer  examination  of  the 
differences  between  so-called  embedded 
computer  systems  and  non-embedded  or 
general  purpose  automatic  data  processing, 
we  found  that  the  basic  distinctions 
between  these  two  categories  were  as 
follows : 

a.  There  is  a  definite 
difference  in  procurement  methodology  in 
the  DoD  between  embedded  computers  and 
commercially  available  ADPE.  In  fact, 
this  area  of  of  such  great  importance  it 


will  be  discussed  separately  as  the  next 
major  management  implication  titled 
"Procurement  Management  Implications." 


b.  There  is  another  major 
difference  between  embedded  and  non- 
embedded  computer  systems  with  respect  to 
management.  This  is  that  the  embedded 
computer  system  is  normally  involved  in 
the  complete  life  cycle  model  as  developed 
by  our  group.  Most  ADP  management  efforts 
are  only  involved  with  the  Support  Phase 
of  the  life  cycle,  since  major  ADP  systems 
such  as  supply,  personnel,  finance, 
inventory  control  and  so  forth  were 
originally  developed  10  to  20  years  ago. 
Current  software  development  programs  are 
either  of  the  maintenance  or  modification 
variety  and  do  not,  in  most  cases,  involve 
new  functional  systems  that  begin  in  the 
Conceptual  Phase. 


Procurement 

Management  Implications 


We  stated  above  that  the  methodology 
for  procuring  Automatic  Data  Processing 
Equipment  (ADPE)  is  different  than  that 
used  for  procuring  Embedded  Computer 
Systems  (ECS)  in  the  Department  of 
Defense.  Thus,  the  answers  to  the 
questions  of  whether  or  not  ECS  or  ADP 
systems  require  different  types  of 
management  practice  or  different  types  of 
life  cycles  are  simply  yes  and  no, 
respectively. 

The  distinction  that  has  been  made 
between  these  two  categories  of  automated 
systems  was  a  direct  result  of  differences 
in  procurement  regulations  stemming  from 
the  1965  "Brooks  Act"  (Reference  4).  The 
Armed  Services  Procurement  Regulation 
(ASPR)  is  used  to  procure  defense  systems 
and,  in  most  cases,  any  computers  embedded 
in  such  systems.  These  are  excluded  from 
the  ADPE  procurement  regulations  that 
implement  Brooks  Act  guidelines,  as 
administered  by  the  Office  of  Management 
and  Budget  (0MB)  and  the  General  Services 
Administration  (GSA) . 

Implementation  of  instructions 
concerning  "GSA-controlled”  ADP  computers 
on  one  hand,  and  "excluded"  embedded 
computers  on  the  other,  has  resulted  in 
two  separate  series  of  directives  and 
instructions  at  the  Office  of  the 
Secretary  of  Defense  level.  The  ADPE  or 
controlled  computer  procurements  fall 
under  the  purview  of  the  Assistant 
Secretary  of  Defense  (Controller)  and  are 
controlled  under  the  "4000  Series" 
instructions.  The  excluded  embedded 
computers  fall  under  the  jurisdiction  of 
the  Office  of  the  Undersecretary  of 
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Defense  for  Research  and  Engineering. 
They  use  the  ASPR  for  most  _ADPE 
procurement  actions  as  governed  by  "5000 
Series"  instructions. 

To  complicate  matters  further,  each 
service  has  developed  its  own  unique 
method  for  implementing  the  4000  and  5000 
Series  regulations  quite  independently  of 
one  another.  Hence,  the  answer  to  the 
original  question  is  yes,  embedded 
computer  systems  (ECS)  and  ADP  systems  do 
require  different  types  of  management,  but 
not  different  types  of  life  cycles. 

This  creates  a  problem  of 
qualification  for  the  managers  involved  in 
the  life  cycle  of  the  system.  In  no  way, 
however,  does  this  change  the  model 
presented  as  the  procurement  method  is 
only  a  policy  guideline  for  management  tc 
follow. 

Functional  Manager  Involvement 

Now  what  did  we  learn  from  an 
examination  of  this  revised  life  cycle 
model?  First,  we  found  that  functional 
managers  must  stay  involved  in  the 
software  aspects  of  their  greater  systems 
throughout  the  life  cycle  which,  as  we 
have  seen,  can  be  an  extremely  long  time. 
They  cannot  abrogate  this  responsibility. 
The  implication  here  is  that  there  must  be 
an  office  of  primary  responsibility 
established  for  that  system  and  the 
responsibility  carefully  transferred 
between  the  inevitable  succession  of 
responsible  incumbents.  This 
responsibility  for  the  functional 
operation  of  that  system  must  have  a 
continuous  thread  throughout  the  system 
life  cycle. 


Software  Management  Interactions 

We  have  shown  previously  that  there 
are  really  five  distinct  types  of  Army 
management  involved  in  the  complete  life 
cycle  of  software  that  is  a  component  part 
of  any  automated  system.  The  implications 
of  having  five  types  of  management  we 
found  to  be  worth  investigating  in  some 
detail. 

C Ai”  model  illustrates  that  one  of  the 
most  difficult  problems  for  a  program 
manager  during  the  Full-Scale  Development 
phase  of  a  major  system  is  that  he  must 
respond  to  the  continuous  management 
oversight  of  the  functional  or  line  manger 
and  his  changing  requirements.  Program 
managers  would  always  reconmend,  if  they 
had  the  choice ,  ' that  the  user  or 
functional  manager  stay  out  of  their  hair 
during  the  development  phase. 


In  addition  to  the  curse  of  che  user 
and  his  changing  requirements,  the  program 
manager  also  has  a  problem  if  new 
technology  is  involved  in  that  the 
technical  manager  will  want  to  insure  that 
any  innovations  developed  by  his 
scientific  and  engineering  people  are 
correctly  incorporated  into  the  new 
system. 


But  that’s  not  all.  The  program 
manager  has  yet  another  individual  to 
answer  to,  the  logistics  manager  who  is 
looking  for  better  and  more  complete 
documentation,  well  defined  system 
interfaces,  modularized  architecture,  and 
evidence  of  the  use  of  modern  progranming 
practices  or  structured  techniques  so  that 
the  system  will  be  easy  to  maintain. 

In  spite  of  the  problems  these 
overseers  may  impose  upon  the  program 
manager  from  time  to  time,  he  must 
incorporate  their  "parochial  views”  into 
his  program  plans.  The  fifth  set  of 
managers,  corporate,  are  responsible  for 
insuring  that  this  is  done  in  such  a  way 
to  best  serve  the  interests  of  the  overall 
organization,  be  it  the  Army,  Department 
of  Defense  or  the  Country. 

In  short,  we  feel  that  this 
conceptually  explains  both  the  necessity 
for  five  interacting  types  of  software 
management  and  the  difficulty  program 
managers  have  in  coping  with  this 
situation  during  the  software  life  cycle. 


Qualifications  and  Exceptions 

There  are,  of  course,  qualifications 
and  exceptions  to  the  model  presented  in 
Figure  1.  Some  of  the  more  important  ones 
that  should  be  considered  are  summarized 
below. 

Degree  of  Structuredness.  The  degree 
°f  risk  or  novelty  In  a  software 
development  project  or  its  "degree  of 
structuredness"  must  be  considered  when 
using  the  model.  Those  projects  which  are 
very  similar  to  others  that  have  been  done 
before,  or  those  involving  modifications 
to  major  systems  generally  do  not  require 
sufficient  numbers  of  scientists  and 
engineers  to  be  employed  to  warrant  the 
special  aspects  of  technical  management. 
On  the  other  hand,  a  major  project  such  as 
SAFEGUARD  or  APOLLO  would  obviously 
require  many  technical  managers  to  handle 
the  large  numbers  of  scientists  and 
engineers  working  on  the  system  in  their 
attempts  to  advance  technology  state  of 
the  art. 
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Prolect  Size.  Much  time  has  been 
devoted  in  this  and  many  other  conferences 
to  describing  differences  in  life  cycle 
management  based  upon  problems  of  scale. 
For  example,  a  short  term,  one-man  project 
to  design  and  develop  applications 
software  to  solve  an  ad  hoc  engineering 
problem  may  have  an  entire  life  cycle  as 
short  as  a  few  months.  Clearly,  the  five 
varieties  of  management  described  above 
would  not  all  be  used.  However,  major 
defense  systems  that  must  undergo  the 
Defense  System  Acquisition  Review  Council 
(DSARC)  process  normally  involve  all  five 
types  of  management  and  have  life  cycles 
as  long  as  20  to  30  years. 

It  is  sufficient  to  say  that  there  is 
a  drastic  difference  between  a  one-year 
development  for  a  small  system  versus  the 
30-year  life  cycle  of  a  major  defense 
system. 


Reliability  Requirements.  Software 
reliability  for  systems  such  as  a 
management  information  system  that 
operates  occasionally  using  only 
historical  information  is  clearly  not  as 
critical  as  for  real  time,  operational 
systems  involving  missile  guidance, 
nuclear  safety  or  life  support.  In 
general,  our  group  feels  that,  as  system 
software  reliability  requirements 
increase,  the  need  for  more  of  the  five 
types  of  management  involvement  also 
increases,  thus  lengthening  the  overall 
software  management  life  cycle. 

Other  qualifications  can,  of  course, 
be  added  to  those  described  above. 
However,  during  the  course  of  our 
deliberations  we  could  not  think  of  any 
qualification  or  exception  that  would 
negate  the  life  cycle  model  as  we  have 
described  it.  That  is  not  to  say  that  one 
or  more  could  not  be  uncovered  through  a 
more  thorough  analysis. 


SLCM  Milestones 

Another  major  topical  area  discussed 
by  the  Dynamics  Theory  Group  involved  the 
milestones  that  delimit  phases  and 
subphases  of  the  software  management  life 
cycle  model  described  above. 


Purpose  and  Nature  of  Milestones 

The  first  question  we  asked  ourselves 
was:  Who  are  these  milestones  for?  Our 
conclusion  was  that  this  question  can  have 
only  one  logical  answer.  Software  life 
cycle  management  milestones  should  be 
designed  for  the  direct  use  of  the  five 
types  of  management  decision  makers 
described  above. 


This  led  to  a  further  conclusion  that 
any  milestone  in  a  software  life  cycle 
management  model  must  be  a  point  of 
measurement  at  which  information  on  the 
state  of  the  life  cycle  process  is 
collected  for  the  sole  purpose  of  serving 
as  an  input  to  help  solve  a  specific 
management  decision  problem.  It  is 
significant  that  this  is  in  concert  with 
the  principles  of  elementary  decision 
theory  which  can  be  used  to  great 
advantage  in  improving  all  levels  of 
software  life  cycle  management.  Thus,  we 
emphasize  that: 

Milestones  should  represent  the 
termination  of  specific  activities  or 
tasks  and  also  provide  a  measure  of 
the  degree  of  completeness  (or 
quality)  of  those  activities.  Hence, 
they  must  be  quantitative  to  be  useful 
to  management  for  judging  the  progress 
of  a  system  through  its  life  cycle. 
If  they  are  not  quantitative  in 
nature,  they  simply  cannot  adequately 
support  this  necessary  measurement 
function. 


Milestone  Taxonomies 

A  secondary  question  as  to  whether  or 
not  there  are  or  can  be  various 
classifications  or  taxonomies  of 
milestones  we  found  not  to  be  relevant  to 
our  other  findings.  Even  though  a  manager 
who  is  interested  in  controlling  a  budget 
might  use  a  different  set  of  milestones 
than,  say,  a  technical  manager  who  is 
interested  in  keeping  track  of  progress  in 
the  engineering  development  of  a  system, 
we  feel  that  at  some  point  in  the 
management  hierarchy  there  exists  (or 
should  exist)  a  single  program  manager  who 
oversees  the  greater  system  that  would  be 
interested  in  all  of  the  milestones 
regardless  of  how  they  are  classified. 


Iterative  Process  Milestones 

One  interesting  conclusion  reached  by 
our  group  came  after  a  discussion  with 
respect  to  the  meaning  of  milestones  in  an 
iterative  environment.  This  environment, 
for  example,  is  one  in  which  a  computer 
program  keeps  cycling  back  through  its 
development  phases  perhaps  because  it  has 
not  passed  an  operational  test.  This 
could  be  a  computer  program  that  has  been 
certified  to  be  bug  free  and  is  in  fact 
running  satisfactorily,  but  has  not  yet 
completed  the  Fine  Tuning  subphase  of  the 
Deployment  Phase. 
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The  milestone  that  indicates 
termination  of  the  Fine  Tuning  subphase 
must  have  associated  with  it  a  set  of 
metrics  which  are  used  to  judge  the 
quality  of  that  computer  program.  This 
quantitative  data  is  used  by  a  management 
decision  maker  to  help  him  decide  when 
that  computer  program  is  ready  to  proceed 
into  the  Modification  subphase  of  the  life 
cycle. 

Implications  of  Milestones 
for  Decision  Making 

Figure  2  illustrates  a  toy  model  of  a 
decision  tree  which  contains  three 
possible  states  of  nature  that  might  occur 
at  a  milestone  in  a  software  life  cycle. 
Branch  A  indicates  that  the  software  is  of 
good  quality.  Branch  B  fair  quality,  and 
Branch  C  poor  quality. 

If  the  computer  program  is  judged  to 
be  of  good  quality  at  that  milestone,  and 
the  decision  maker  is  viewed  as  sitting  at 
Node  1,  he  might  be  faced  with  only  a 
single  alternative  with  respect  to  what 
the  next  directive  to  issue  will 
be. .. "continue  on." 


If  the  computer  program  has  not  passed 
the  criteria  established  for  that 
milestone  and  is  considered  bad,  the 
program  manager  can  be  viewed  as  sitting 
at  Node  3.  Now  he  is  faced  with  multiple 
alternatives  to  decide  upon.  For  example, 
should  the  program  be  killed?  Should  the 
program  be  allowed  to  proceed  while  that 
particular  computer  program  is  sent  back 
for  more  work  and  so  on? 

Similarly,  if  we  take  the  middle  case 
at  Node  2  where  a  computer  program  is 
determined  to  be  marginal  based  upon  a  set 
of  milestone  quantitative  criteria,  there 
again  would  be  several  alternatives  from 
which  the  decision  maker  must  choose. 

When  we  reflect  upon  this  type  of  a 
model,  we  can  see  the  striking  resemblance 
to  the  program  progress  review  briefings 
given  to  corporate  managers  by  program 
managers  with  respect  to  various  aspects 
of  their  programs.  It  is  common  practice 
to  show  categories  such  as  budget, 
software,  hardware,  organization  and 
schedule  in  red,  yellow  or  green  colors 
indicating  "in  trouble,"  "potential 
problems"  or  "no  problems,"  respectively. 


Start  of  task 
or  activity 

Path  of  accomplishment 

Completion 

milestone 

Management  decision 

Start  of  next 
task  or 

. 

activity 
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What  this  really  means  in  terms  of 
decisions,  decisionmakers,  and  software 
life  cycle  management  is  simply  this,  a 
milestone  is  meaningless  unless  it  has 
associated  metrics.  The  metrics  must  be 
such  that  they  provide  meaningfull 
information  to  the  manager  at  whatever 
level  so  that  he  can  make  appropriate 
decisions,  that  is,  choose  the  best 
erceived  alternative  course  of  action 
ased  upon  the  information  presented. 

Milestone  Metric  Research 

Now  the  basic  question  remains:  How 
do  we  measure  attributes  of  software  such 
as  quality  or  completeness  at  required 
milestones? 

Some  work  is  being  done  in  this  area 
but  clearly  not  enough.  An  extremely 
important  research  direction  for  the  Army 
to  pursue  is  to  develop  a  set  of  common 
measurable  milestones  for  SLCM  projects. 
Each  milestone  must  have  associated  with 
it  relatively  easily  measured  attributes 
of  the  products  delivered  at  the  end  of 
the  activity  it  stands  for. 

Furthermore,  we  must  all  realize  that 
progress  of  software  through  its  life 
cycle  is  not  measured  by  time,  but  by 
accomplishments.  Thus,  there  must  be  some 
description  of  measurable  criteria  with 
which  to  determine  physical  evidence  of 
progress.  These  milestone  accomplishment 
critera  must  be  complete  enough  and  in 
sufficient  detail  that  the  resulting 
measurements  will  be  sufficient  to 
generate  relevent  management  decisions. 


Postscript 

Federal  ADP  Reorganization  Study 

The  workshop's  featured  speaker,  Mr. 
E.  Larry  Dreeman,  reported  on  the  findings 
of  the  Federal  ADP  Reorganization  Study 
National  Security  Team  which  he  chaired. 
Mr.  Dreeman' s  team  was  chartered  to 
investigate  ADP  activities  in  the 
Department  of  Defense.  This  was  a  portion 
of  one  of  31  Federal  government 
reorganization  projects  initiated  by 
President  Jimmy  Carter.  The  overall 
project  study  objective  was  to  investigate 
ADP  in  the  Federal  government  and  to 
recommend  inprovements  in  the  government's 
use  of  information  technology; 
specifically,  to  improve  delivery  of 
government  service,  improve  acquisition 
management  and  use  of  information 
technology  and  eliminate  duplication  and 
overlap. 

Mr.  Dreeman' s  presentation  evoked  a 
very  active  response  from  the  audience  as 


it  appeared  that  many  of  the  findings  and, 
in  particular,  the  reconmendations  were 
quite  controversial.  After  the 
presentation,  I  had  the  opportunity  to 
discuss  several  aspects  of  the  study  with 
Mr.  Dreeman  in  more  depth  that  led  to  his 
request  for  a  formal  input  to  the  study. 
A  summary  of  my  personal  response  to  this 
request  is  outlined  below  for  the  benefit 
of  those  who  attended  Mr.  Dreeman 's 
presentation. 

Central  Problem 

My  18  years  of  involvement  in  the 
military  computer  resource  management  area 
has  led  me  to  conclude  that: 

The  primary  cause  of  many  of 
today's  Department  of  Defense  ADP 
management  problems  involves  the 
obsolescence  of  the  centralization  of 
ADP  management  control  policies. 

This  does  not  mean  that  1  do  not  agree 
that  there  should  be  centralized 
policymaking  and  guidance  to  prevent  chaos 
in  this  8-10  billion  dollar  a  year 
Department  of  Defense  business  area.  The 
distinction  1  make  is  between  policymaking 
and  control. 


Recommended  Solution 

1  recommend  that  policymaking  for 
technical  issues  continue  to  be 
centralized  in  the  form  of  a  Department  of 
Defense  ADP  "focal  point"  to  serve  as  an 
interface  with  other  Agencies,  the  General 
Services  Administration,  the  Office  of 
Management  and  Budget  and  the  Congress. 
In  addition  to  the  single  Interface 
function,  the  focal  point  will  develop 
policies  and  guidance  in  the  areas  such  as 
the  following: 

a.  Standardized  methodologies 
for  the  acquisition  of  automatic  data 
processing  equipment  (ADPE) . 

b.  Guidance  with  respect  tc 
standardization  of  computer  programming 
languages  for  use  within  the  Department  of 
Defense. 

c.  Guidance  and  standardizatior 
of  computer  and  peripheral  interfaces 
within  the  Department  of  Defense. 

d.  Standardization  of  data 
elements  and  codes  used  within  the  ADF 
environment  in  the  Department  of  Defense. 

The  problem  with  the  centralization 
scheme  as  it  exists  today  is  that  the 
authority  for  purchasing  computers  and 
associated  software  is  retained  at 
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excessively  high  levels.  This  hinders 
effective  equipment  and  software 
replacement  programs  designed  to  prevent 
the  massive  hardware  obsolescense  problem 
accurately  reported  by  the  National 
Security  Team  in  its  "Draft  Report" 
(Reference  5) . 

Thus,  I  strongly  recommend  that: 

The  authority  to  purchase 
computers  and  software  should  be 
decentralized  to  the  maximum  extent 
possible  subject  to  centralized 
technical  standardization  policies. 

Furthermore,  the  purchasers  of 
ADPE  and  commercially-available 
software  in  the  DoD  should  use  a 
"systems  approach"  similar  to  that 
used  for  the  acquisition  of  weapon 
systems  hardware  and  software  using 
the  Armed  Services  Procurement 
Regulation  and  as  further  defined  in 
Air  Force  Regulation  800-14  (Reference 
6) . 

This  means  that  DoD  functional 
managers  should  have  more  control  over  the 
type  and  quality  of  ADPE  and  software  that 
is  used  within  their  areas  of  expertise 
and  jurisdiction. 

Relationship  of  ADP  Management 
to  ECS  Management 

We  must  recognize  that  the  reason  for 
the  success  of  recent  (I  stress  the  word 
recent)  defense  system  procurements 
involving  embedded  computer  hardware  and 
software  lies  not  in  the  use  of  the 
"excluded"  special-purpose,  militarized 
computers  but  in  the  perspective  of  the 
program  manager  who  is  developing  a  system 
which  is  not  an  ADP  system  but  rather  an 
aircraft,  tank,  missile  or  spacecraft. 
This  same  embedded  computer  system 
philosophy  can  be  profitably  applied  in 
the  ADP  environment. 

For  example,  any  DoD  personnel  manager 
should  treat  his  ADP  equipment,  computer 
programs,  supplies,  people,  computer  data, 
and  all  other  computer  resources  as  but 
component  parts  of  his  "personnel  system." 
The  ADP  manager  should  no  longer  be 
permitted  to  unilaterally  dictate  what 
specific  types  of  equipment  are  necessary 
and  allowable  for  the  personnel  manager  to 
perform  his  function.  A  more  effective 
role  for  the  ADP  "single  manager"  should 
consist  of  providing  the  personnel  manager 
and  other  line  and  staff  managers  with 
technical  advice  and  service  as  requested. 


Arguments  Pro  and  Con 

Critics  of  the  above  "systems 
approach"  to  ADP  management  state  that  we 
would  soon  return  to  chaos  in  the  ADP 
environment  if  we  dismantle  the  current 
strict  review  process  which  proceeds  from 
the  lowest  levels  all  the  way  up  through 
the  General  Services  Administration  and, 
incidentally,  takes  nine  months  to  a  year 
and  a  half  for  approvals  for  equipment 
over  $50,000. 

I  contend  that  realistic  and  well 
thought  out  policies  developed  by  the 
centralized  ADP  policy  officials  can  be 
enforced  by  existing  Inspector  General  and 
auditor  organizations  within  the 
Department  of  Defense.  It  is  their  job  to 
make  sure  that  all  policies  dictated  from 
higher  authority  are,  in  fact,  carried  out 
at  the  lowest  levels  of  all  military 
organizations.  Therefore,  the  technical 
policies,  such  as  the  use  of  specific 
higher  order  languages  and  standard  data 
elements  and  codes  can  be  easily  checked 
for  compliance  during  the  normal  visits  to 
DoD  installations  by  these  enforcement 
agency  representatives. 

With  respect  to  the  argument  that  the 
costs  could  go  out  of  control  if  this 
centralized  authority  is  not  retained  at 
the  highest  levels,  I  contend  that  the 
budgetary  process  itself  will  hold  down 
costs,  probably  to  a  greater  degree  than 
one  can  achieve  with  centralized 
management  control.  The  reason  for  this 
is  that  the  functional  line  or  staff 
manager  receives  one  slice  of  the  budget 
pie  as  his  total  share  for  any  given 
fiscal  year.  If  it  can  be  proven  to  him 
that  a  new  piece  of  ADPE  can  reduce  costs, 
improve  organizational  efficiency, 
increase  functional  system  performance,  or 
has  any  other  purpose  that  is  worth 
spending  some  of  his  limited  monetary 
resources  on,  he  will  probably  approve  it. 
However,  all  functional  managers  within 
that  commander's  organization  are  also 
trying  to  get  their  own  pieces  of  the  same 
pie  slice.  This  adversary  situation  with 
respect  to  the  budgetary  process  is  normal 
in  all  areas  at  the  present  time  to  the 
best  of  my  knowledge,  except  for  ADPE.  I 
believe  that  if  ADPE  authority  for 
purchase  is  decentralized  to  the  lowest 
functional  levels  possible,  the  normal 
budgetary  process  will  be  much  more 
effective  in  reducing  overall  ADP  costs 
than  does  the  present  highly  centralized 
management  control  system. 

Finally,  with  respect  to  the  potential 
problem  of  not  having  expert  help  if  there 
is  not  a  strong  central  technical 
organization  that  retains  the  authoritv  tc 
pass  Judgment  on  ADP  purchases,  I  submit 
that  a  very  effective  organization  already 
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exists  to  help  users  with  this  function, 
the  Federal  Simulation  Center  (FEDSIM) . 
This  agency  can  be  called  in  by  any 
functional  manager  in  the  Department  of 
Defense  as  a  consultant  to  help  them  with 
the  technical  problems  of  performance 
evaluation,  monitoring,  equipment 
tradeoffs,  make  or  buy  decisions,  source 
selection,  competitive  procurement,  and  so 
forth. 

Postscript  Summary 

In  summary,  the  three  basic  points  I 
will  be  making  to  the  National  Security 
Team  Chairman,  Mr.  Dreeman,  are  as 
follows : 


a.  A  serious  problem  with  ADP 
management  in  the  Department  of  Defense  is 
the  obsolescense  of  Brooks  Act 
implementation  policies. 

b.  I  recommend  that  ADPE 
technical  issue  policymaking  be 
centralized  to  conform  with  the  intent  of 
the  Congress  for  single  focal  point 
cognizance  and  reporting. 

c.  I  recommend  decentralizing 
the  authority  to  purchase  ADPE  and 
software  such  that  functional  managers  are 
given  the  authority  to  develop  their  ADP 
resources  just  as  they  develop  all  other 
resources  at  the  present  time. 

Thus,  I  do  not  believe  there  are  any 
compelling  reasons  for  continued 
centralization  for  ADPE  control  at  the 
highest  levels  of  DoD  management,  and 
strongly  request  that  the  final  National 
Security  Team  recommendations  to  President 
Carter  be  modified  to  incorporate  these 
views . 


In  Conclusion 


The  Dynamics  Theory  Group  lists  the 
following  conclusions  and  recommendations 
as  those  deserving  further  consideration 
by  the  Army  in  its  attempt  to  improve 
software  life  cycle  management: 

1.  Appropriate  management  methods 
that  are  applicable  to  different  phases  of 
the  life  cycle  do  not  vary  across 
different  classes  of  software.  However, 
the  specific  management  method  used  can  be 
and  usually  is  different  for  different 
phases  of  the  software  life  cycle. 

2.  We  recommend  that  our  proposed 
software  life  cycle  management  model  shown 
in  Figure  1  be  used  as  a  strawman  baseline 
for  a  follow-on  research  effort  by  the 
Army  to  add  conceptual  detail  to  the 
individual  life  cycle  subphases. 


3.  We  feel  that  additional  emphasis 
should  be  placed  on  the  management 
activities  in  the  Deployment  Phase  of  the 
software  life  cycle,  especially  with 
respect  to  the  problems  of  transfer  of 
development  management  responsibility  to 
the  support  management  team.  We  recommend 
that  the  Army  follow  up  on  this  point  with 
further  research  to  develop  an  effective 
system  (software)  deployment  strategy. 


4.  We  recommend  that  the  Army  focus  a 
significant  research  effort  on  the 
critical  problem  of  trying  to  determine 
the  point  when  software  "fine  tuning"  is 
sufficiently  complete  to  be  able  to 
confidently  transfer  a  system  to  the 
Support  Phase  of  its  life  cycle. 

5.  The  planned  cost  of  software 
embedded  in  a  system  should  terminate  with 
the  initial  entry  of  the  system  into  the 
System  Modification  subphase  of  the 
Support  Phase  of  its  life  cycle. 

6.  The  answers  to  the  questions  of 
whether  or  not  ECS  or  ADP  systems  require 
different  types  of  management  practice  or 
different  types  of  life  cycles  are  simply 
yes  and  no,  respectively. 

7.  Milestones  should  represent  the 
termination  of  specific  activities  or 
tasks  and  also  provide  a  measure  of  the 
degree  of  completeness  (or  quality)  of 
those  activities.  Hence,  they  must  be 
quantitative  to  be  useful  to  management 
for  judging  the  progress  of  a  system 
through  its  life  cycle.  If  they  are  not 
quantitative  in  nature,  they  simply  cannot 
adequately  support  this  necessary 
measurement  function. 


8.  An  extremely  important  research 
direction  for  the  Army  to  pursue  is  to 
develop  a  set  of  common  measurable 
milestones  for  SLCM  projects.  Each 
milestone  must  have  associated  with  it 
relatively  easily  measured  attributes  of 
the  products  delivered  at  the  end  of  the 
activity  it  stands  for. 
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INTRODUCTION 

Seven  position  papers  are  given  here  that  re¬ 
flect  the  practice  and  experience  of  each  partic¬ 
ipant  in  the  dynamics  of  software  life  cycle 
management.  On  the  basis  of  his  past  contribut ions 
to  the  field,  each  participant  was  invited  by  the 
U.S.  Army  Institute  for  Research  in  Management 
Information  and  Computer  Science  (AIRMICS) .  Along 
with  the  position  papers,  the  participants  dealt 
extemporaneously  with  a  list  of  questions  offered 
by  the  AIRMICS  technical  chairman.  Dr.  Victor 
Basil i.  Three  of  the  questions  were  selected  for 
group  discussion  as  having  the  most  interest  at 
this  time  and  the  greatest  potential  leverage  in 
reducing  cost  and  risk  for  future  AIRMICS  projects. 
The  findings  in  these  three  areas,  management 
dynamics,  software  tools,  and  life  cycle  mainte¬ 
nance,  are  summarized  here.  The  participants  for 
this  session  of  AIRMICS  78  are: 

George  J.  Schick,  University  of  Southern 

Cal ifornia 

Barbara  C.  Stewart,  Honeywell  Systems 

Harold  S'- one.  University  of  Massachusetts 

Ivan  J.  Jaszlics,  Martin  Marietta,  Denver 

Gerald  Weinber,  Ethnotech,  Inc. 

Ray  W.  Wolverton,  TRW  Systems  Group 

Kenneth  Kolence,  Institute  for  Software 

Engineering 

In  addition,  all  attendees  (approximately  50)  par¬ 
ticipated  in  generating  answers  to  the  three 
questions  chosen  bv  the  group  (i.e.,  participants 
plus  a 1 1 eridces )  . 

OVERVIEW 

R.  C.  McHenry  and  J.  A.  Rand  of  TBM  contrib¬ 
uted  a  position  paper,  although  circumstances 
prevented  its  oral  presentation.  They  believe  that 
the  very  nature  of  top-down  development  allows  It 
to  become  a  powerful  tool  and  technique  for  system 
Integration,  thereby  leading  to  earlier  and  more 
romplete  system  readiness  than  would  otherwise  bo 
possible.  Their  key  point  is  that  by  Incremental 
development  and  testing,  a  new  discipline  is 


possible  (termed  transition  management)  that  - 
properly  implemented  -  does  not  require  added 
development  time  for  testing.  This  is  an  extreme¬ 
ly  powerful  hypothesis  for  future  AIRMICS  study. 

G.  J.  Schick  and  C.  Lin  of  USE  show  quantita¬ 
tive  techniques  for  determining  the  manager's 
preference  for  prior  distributions  that  are  needed 
in  software  reliability  models  using  Bayesian 
probability  theory.  Predictive  reliability  models 
assist  the  manager  is  estimating  the  number  of 
errors  indigenous  to  the  software  system  under 
development  and  the  amount  of  time  required  to 
reduce  the  indigenous  error  population  to  an 
acceptably  low  level.  Their  solution  lies  in  an 
automated  question  and  answer  dialog  with  the 
practitioner  to  find  his  level  of  indifference  to 
alternatives  that  imply  statistical  fractiles.  In 
this  way  mathematically  tractable  estimates  of 
error-reduction  can  be  made  that  incorporates  the 
practi tioner ' s  software  experience  and  intuition. 

One  of  the  stated  assumptions  is  that  soft¬ 
ware  development  is  not  a  branch  of  mathematics 
but  rather  a  special  form  of  communication,  person 
to  person  and  person  to  machine.  B.  C.  Stewart  of 
Honeywell  Systems  offers  a  new  discipline  for 
alleviating  the  intrinsic  difficulties  of  communi¬ 
cation,  particularly  early  in  the  design  process, 
that  combines  an  analysis  model  and  analytical 
procedures.  Her  methodology  has  the  benefits  of 
assuring  that  both  design  goals  and  organizational 
goals  are  met,  providing  a  means  to  evaluate  the 
cost  effectiveness  of  the  organi zat ion ' s  design 
methodology;  and  establishing  a  measn  bv  which 
differing  design  methodologi os  can  be  quantita¬ 
tively  compared. 

Harold  Stone  of  the  University  of  Massa¬ 
chusetts  and  Aaron  Coleman  of  the  U.S.  Armv 
(CORADCOM)  report  on  their  hardware/software  life- 
cycle  model  that  measures  the  cost  of  stan¬ 
dardizing  the  computer  instruction  set  together 
with  the  support  tools  for  the  military  computer 
family  (MCF) .  Their  results  show  that  the  CYK-41 
(PDP-11),  out  of  the  set  of  four  semi f inal 1st s  in 
the  MCF  study,  ranks  as  the  best  choice  for  the 
MCF  under  their  criteria  of  comparison.  They  use 
a  22-vear  interval  for  acquisition  and  deployment 
of  candidate  MCF  computers  and  support  software  - 
1980,  1985,  anu  1990  -  with  each  lot  deployed  for 
10  years.  Their  model  is  successful  in  identi¬ 
fying  the  critical  cost-drivers  and  in  estimating 
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their  relative  importance,  although  it  is  not 
intended  to  predict  dollar  costs  with  accuracy.  A 
crucial  factor  in  the  life-cycle  cost  analysis  is 
that  the  greater  the  value  of  the  software  tool 
base,  the  lower  the  cost  per  line  of  applications 
code.  Their  model  estimates  the  tool  value  as  a 
function  of  time  to  reach  time  varying  estimates 
of  productivity.  Their  analysis  shows  that  the 
GYK-41  (PDP-11)  offers  a  cost  savings  of  $1.5 
billion  over  the  next  lowest  contender. 

C.  E.  Valez  and  Ivan  J.  Jaszlics  of  Martin 
Marietta,  Denver,  present  a  position  paper  on  use¬ 
ful  evaluation  tools  in  the  design  process.  Their 
hypothesis,  supported  by  initial  experience,  is 
that  design  languages  are  emerging  for  identfying 
requirements,  design  components,  and  design  speci¬ 
fications  on  the  basis  of  which  coding  can 
commence.  They  believe  that  several  design 
languages  apply  at  different  levels  of  the  design 
process.  One  of  the  main  purposes  of  a  given 
design  language  is  to  provide  the  human  the 
capacity  for  interaction  beyond  the  first  available 
solution  to  the  best  solution  for  his  requirements. 
Two  often  neglected  phases  in  a  design  language 
approach  are  included  in  their  potential  solution: 
the  definition  of  the  man-machine  interface  and 
computer  resource  requirements.  They  believe  that 
an  integrated  software  design  concept  is  essential 
to  comprehensive  definition  of  the  system  devel¬ 
opment  interfaces.  The  importance  of  program 
design  languages  for  the  upcoming  generations  of 
software  cannot  be  over-emphasized. 

Gerald  Weinberg  of  Ethnotech,  Inc.,  expresses 
his  views  on  why  the  expected  gains  from  program¬ 
ming  tools  have  been  slow  in  arriving,  and  often 
disappointing  when  they  do  arrive.  He  believes 
that  the  problem  lies  in  the  failure  to  understand 
the  processes  by  which  new  technology  is  intro¬ 
duced.  The  role  of  training  has  been  left  by 
default  to  computers,  under  the  assumption  they  are 
better  or  cheaper  than  human  teachers.  His 
solution  is  to  provide  an  overall  climate  for 
professional  learning  in  which  both  the  computer 
has  a  role  and  the  human  teacher  have  a  role. 
Probably  all  the  tools  needed  to  solve  the 
elaphant's  share  of  the  software  development  pro¬ 
blems  have  already  been  created,  at  the  cost  of 
per  .aps  a  billion  dollars  a  year,  a  micro- 
organismic  sum  of  money  has  been  spent  on  training 
people  to  use  those  tools.  This  means  the  practi¬ 
tioner  does  not  use  the  tools  accessible  to  him. 

We  have  spent  billions  for  "tools",  but  not  even 
pennies  on  understanding  what  is  needed  to  create 
the  professional  technical  leaders  who  will  use 
them.  The  answer  he  advocates  is  to  take  computer 
training  out  of  the  realm  of  computers  and  put  it 
In  the  brains  of  people. 

R.  W.  Wolverton  and  B.  W.  Boehm  summarize  the 
more  important  lessons  learned  in  developing  a 
cost  model  for  TRW.  The  kev  issues  are  first  a 
need  to  develop  agreed-upon  criteria  for  the  value 
of  a  software  cost  model,  second  a  need  to  evaluate 
existing  and  future  models  with  respect  to  these 
criteria,  and  third  a  need  to  emphasize  construc¬ 


tive  models  that  relate  their  cost  estimates  to 
software  phenomenology  and  project  dynamics.  A 
potential  solution  to  these  needs  is  given  by  nine 
criteria  defining  the  goodness  of  a  software  cost 
model.  With  respect  to  the  emphasis  on  dynamics 
of  the  AIRMICS  workshop,  the  structural  form  of 
the  cost  model  can  be  used  at  RFP  time  (e.g.,  it 
does  not  ask  for  input  data  available  only  after 
design)  and  at  all  subsequent  development  cycle 
reviews.  As  more  and  more  verifiable  input  data 
become  available  they  replace  the  estimates,  un¬ 
known  data  estimates  are  made  current,  and  the 
manager  can  see  the  incept ion-to-date  information 
on  the  estimated  cost  to  complete  and  time  to 
complete.  Each  cost  element  can  be  traced  to  a 
work  unit  in  the  work  breakdown  structures  (WBS), 
and  the  manager  can  readily  spot  the  trouble 
areas  through  management  by  exception  techniques. 

A  tie-in  to  the  WBS  is  essential  to  show  what  is 
(and  is  not)  included  in  the  resulting  cost 
estimate. 

Kenneth  Kolence,  president  of  the  Institute 
for  Software  Engineering,  stands  by  his  position 
that  the  field  has  advanced  to  the  point  where  we 
now  have  a  discipline  of-software  physics.  What 
is  needed  to  tap  this  resource  is  an  understanding 
of  what  this  means  relative  to  work  performed,  the 
capacity  to  take  on  new  work,  and  the  use  of  the 
metrics  now  observable.  One  then  organizes  the 
work  of  software  design  around  a  forecast  for  the 
use  of  facilities.  The  problem  regarding  soft¬ 
ware  acquisition  scheduled  for  the  mid  1980’s  is 
to  define  in  1978  what  data  is  really  needed, 
collect  it,  analyze  it  by  the  laws  of  software 
physics,  and  incorporate  it  into  an  action  plan. 

MANAGEMENT  DYNAMICS 

Any  person  who  influences  the  irreversible 
use  of  resources  is  in  truth  a  manager  whether 
his  title  is  that  or  not.  Dynamics  deals  with 
forces  and  their  relation  primarily  to  motion 
or  time  behavior  but  sometimes  also  to  the 
equilibrium  of  the  acts  of  management.  The  word 
also  implies  patterns  of  change  of  growth.  There¬ 
fore,  management  dynamics  refers  to  the  forces 
inherent  in  the  process  of  leadership  and  their 
interplay  as  a  function  of  Intensity  and  fre¬ 
quency. 

Startup  Conditions 

Software  management  has  the  same  major 
components  of  organization  and  enactment  as  the 
successful  management  of  any  other  complex  human 
endeavor,  with  two  significant  observed  excep¬ 
tions.  First,  managers  do  not  generally  recog¬ 
nize  that  the  software  development  process  has 
more  degrees  of  freedom  than  another  project  of 
equilivent  dollar  value.  This  unique  quality 
accentuates  the  recurrent  people-oriented 
difficulties  associated  with  span  of  control 
through  not  knowing  preeiselv  what  to  control,  by 
what  criteria,  at  what  timely  milestone  events. 

Senior  managers  are  now  in  decision-making 
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roles  by  a  traditional  reward  system  based  on 
earlier  career  successes,  where  success  is  defined 
by  the  manager's  management.  Often,  the  recog¬ 
nition  is  based  on  earlier  projects  for  which  the 
software  component  was  smaller  and  the  criteria 
of  success  not  clearly  focused.  These  early  pro¬ 
jects  were  often  managed  through  intuitive,  and 
undescribable ,  methodologies.  Inevitably,  this 
situation  is  accompanied  by  schedule  slippage, 
cost  overrun,  and  low-quality  field  software. 

To  further  aggravate  this  condition,  these 
(now)  older  in  situ  managers  are  overtaken  by 
events  in  the  form  of  accelerating  advances  in 
computer  technology  and  still  more  degrees  of  free¬ 
dom,  Increasingly  difficult  management  decisions 
(shifting  from,  say,  two-valued  deterministic 
choices  to  multi-valued  probabilistic  choices), 
and  the  conflict  created  by  the  need  for  "detach¬ 
ment"  for  the  sake  of  overall  visibility  on  the 
one  hand,  and  the  need  for  "involvement"  for  the 
sake  of  in-depth  understanding  on  the  other  hand. 

If  the  top-level  manager  does  not  understand  the 
complexity  of  the  dynamics  of  the  software  devel¬ 
opment  process  he  is  in  difficulty.  What  a 
manager  does  not  understand  he  cannot  manage.  His 
recourse,  often  adopted  unconsciously ,  is  to  move 
from  active  management  to  reactive  administration. 
He  is  now  driven  by  events,  and  the  otherwise 
manageable  project  becomes  unmanageable. 

Paradoxically,  too  much  "understanding"  of 
the  problem  to  be  solved  gives  rise  to  too  many 
good  ideas  for  its  solution.  This  can  lead  to 
expensive  gold-plating  at  one  exteme  or  paralysis 
and  delay  at  the  other.  Moreover,  always  under¬ 
lying  the  management  of  software  is  that  software 
tends  to  be  invisible  unless  made  visible. 

Second,  managers  do  not  generally  recognize 
that  a  software  project  is  of  the  same  nature  as  a 
comparably  sized  non-software  project.  Software 
implementation  needs  a  capital  base  for  estab¬ 
lishing  an  overall  professional  working  climate, 
i.e.,  modern  computing  facilities,  compilers, 
operating  systems,  effective  support  tools,  trained 
people,  and  a  support  group  Including  configuration 
and  data  management  specialists.  The  work  is  done 
by  human  beings,  not  machines.  Some  managers  do 
not  acknowledge  that  software  is,  in  fact,  manage¬ 
able,  and  this  leads  to  self-fulfilling  prophecy 
that  it  is  not.  Software  is  a  thing,  a  product, 
an  asset,  it  is  as  real  as  hardware  and  can  be 
managed . 

Ideal  Management  Circumstances 

The  Ideal  ingredients  of  software  management 
can  be  identified  unambiguously .  However,  it 
appears  that  some  things  are  so  obvious  as  to  be 
overlooked  or  not  applied  when  they  are  easily 
within  the  manager's  grasp.  Management  should 
accept  and  define  the  problem  and  set  team  goals. 

He  should  not  accept  fuzzy  or  ill-defined  project 
requirements;  other  non-software  technical  dis¬ 
ciplines  would  reject  many  Jobs  readily  accepted 
by  software  managers.  The  most  crucial  step  is 


for  the  manager  to  properly  subdivide  the  problem 
into  manageable-sized  packages  of  work  and  assign 
clear  lines  of  authority  and  commensurate  respon¬ 
sibility. 

He  must  assign  tasks  to  individuals  (e.g., 
sub-project  managers)  and  set  individual  goals. 

One  way  to  do  this  is  to  define  subordinate 
objectives  that,  in  turn,  support  senior  objec¬ 
tives.  The  manager  assigns  a  measurable  task  to 
an  individual  for  completion  by  a  certain  date  at 
a  cost  not  to  exceed  so  many  dollars.  Then  the 
manager  can  constructively  monitor  and  assist  the 
work  objectives  by  periodic  reviews  organized  to 
compare  outputs  against  objectives,  including  cost 
and  schedule  predictions  versus  actuals.  A  work 
breakdown  structure  is  a  proven  technique  for 
helping  to  avoid  ambiguity.  He  has  existing  tools 
to  aid  him  in  doing  an  effective  job. 

Management  Aphorisms 

Here  is  a  collection  of  aphorisms  put  forth 
by  this  group  during  the  AIRMICS  78  workshop. 

a)  Ensure  that  common  standards  apply  to 
all  parts  of  a  project.  Ensure  that 
the  interfaces  between  modules  are 
managed  at  a  high  enough  level  for  the 
consequences  of  any  change  to  be 
appreciated.  No  one  can  enforce  an  order 
that  the  consequences  of  a  change  be 
appreciated. 

b)  Simple  projects  can  be  managed  by  a 
traditional  "scalar  chain"  line-staff 
relationship.  Complex  projects,  which 
are  more  the  norm,  require  staffing  from 
different  disciplines.  A  matrix  organ¬ 
ization  is  required.  Then  conflicts 
ensue  between  line  and  project;  this  is 
called  divine  discontent. 

c)  Within  a  matrix  organization  the  emphasis 
in  technology  will  shift  over  the  use¬ 
ful  life  of  the  software.  The  manager’s 
management  should  have  a  written  plan  to 
change  the  organization  to  suit  the 
demands  of  the  software  evolution.  If 
the  plan  is  not  written  down  it  does  not 
exist . 

d)  A  manager  should  have  a  technical  back¬ 
ground  and  explicit  training  in  manage¬ 
ment  skills.  The  manager  should  be 
relieved  of  the  requirement  that  he  is 
the  most  technically  competent.  He 
should  have  the  intrinsic  ability  to 
motivate  and  develop  loyalty.  This 
circumstance  now  meets  the  criteria  of 
Murphy's  Law. 

e)  The  Ideal  manager  has  superb  management 
training,  but  nobody  notices. 

f)  Everything  good  happens  early.  Unmanaged 
projects  do  not,  unfortunately,  have  a 
high  infant  mortality  rate.  To  bring 

an  unmanaged  project  under  control 


requires  changing  of  the  project  manager, 
revising  the  project  plan,  developing  a 
new  schedule  and  budget,  and  revising 
objectives.  Otherwise,  there  is  nothing 
to  it. 

g)  If  you  are  a  good  leader  who  talks  little, 
they  will  say,  when  your  work  is  done  and 
your  aim  fulfilled,  "We  did  this  our¬ 
selves"  (Lao-Tse).  This  view  does  not 
succeed  east  of  Los  Angeles. 

h)  Few  programmers  become  major  officers  of 
the  company.  It  is  probably  true  that 
programmers  have  a  less  clearly  defined 
career  path  than  most  professionals. 

i)  To  sell  an  idea  to  management,  make  sure 
management  thinks  of  it  first.  There  is 
nothing  that  cannot  be  accomplished  if 
one  doesn't  care  who  gets  the  credit.  The 
fact  is  that  everybody  does  care  who  gets 
the  credit. 

j)  No  project  will  succeed  if  the  energy  is 
directed  toward  placing  blame.  One  can 
find  out  if  a  project  is  beginning  to  be 
in  trouble  by  asking  the  secretary  who  in 
the  project  is  building  a  white  file. 

k)  Hardware  is  built  from  documentation. 
Software  is  built  and  then  documented. 

This  documentation  is  often  for  an  earlier 
version  than  "as  built."  Matching  the 
documentation  to  the  software  as  delivered 
is  a  management  goal,  frequently  unreal¬ 
ized  . 

11  Decisions  regarding  hardware  design  and 
implementation  are  nearly  irrevocable, 
whereas,  the  software  manager  can  naively 
operate  from  the  false  premise  that  he  can 
correct  faulty  decisions  at  a  much  later 
stage  in  the  development  process.  He  gets 
to  have  this  attitude  for  one  project 
only, 

m)  Some  human  behavior ists  who  think  about 
"management  success"  are  grounded  in  the 
view  of  Machiavelianism,  especially  the 
view  that  politics  is  amoral  and  that  any 
means  however  unscrupulous  is  justifiable 
in  achieving  political  power.  Many 
successful  managers  become  politicians  and 
vice  versa. 

SOFTWARE  TOOLS 

A  tool  is  something  necessary  in  the  practice 
of  a  vocation  or  profession,  as  a  scholar's  books 
are  his  tools.  This  section  gives  an  overview 
and  minimum  detail  on  the  wide-ranging  concerns  of 
this  group  on  software  tools. 

Table  I  shows  a  few  up  to  date  software  tools 
for  typical  development  cycle  stages.  Literally 
hundreds  more  exist.  If  one  views  maintenance, 
especially  enhancement  maintenance,  as  a  series  of 
mini-development  cycles,  one  can  apply  essentially 
the  same  tools  to  the  sequence  of  planned  enhance¬ 
ments  during  maintenance.  Under  these  conditions 


a  given  tool  may  have  a  useful  lifetime  of  10  or 
more  years.  Although  this  group  did  not  consider 
the  quantitive  aspects,  say,  of  the  breakeven  cost 
for  tool  development,  a  plausible  case  will  be 
made  for  the  cost  effectiveness  of  soundly  con¬ 
ceived  (and  transportable)  tools  based  on  group 
experience. 

Croup  Observations  About  Existing  Tools 

The  key  goal  regarding  the  future  of  manage¬ 
ment  information  tools  is  increased  visibility  by 
the  project  manager  at  any  time.  This  goal 
implies  walking  to  a  local  terminal  and  getting 
project  actual  versus  planned  expenditures, 
estimates  of  time  and  cost  to  complete,  and  other 
statusing  indicators  especially  management  by 
exception  indicators,  for  example,  the  cost  from 
inception  to  date  for  a  particular  module  ex¬ 
ceeding  a  predetermined  threshold  such  as  10 
percent,  within  a  certain  period. 

Four  steps  are  involved  in  reaching  this 

goal: 

a)  Clearly  define  the  management  infor¬ 
mation  process  in  an  organization. 

b)  State  requirements  for  tools. 

c)  Implement  tools  starting  with  the  areas 
with  the  highest  payoffs. 

d)  Audit  existing  tools  for  current  effec¬ 
tiveness.  Identify  candidates  for 
replacements . 

One  premise  is  that  the  managment  process 
is  driven  by  the  product  structure.  Management 
tools  must  be  unified  with  product  description 
tools.  The  tools  must  be  interactive  in  the  sense 
that  task  networks  can  be  readily  modified  as  the 
design  developes.  The  task  partitioning  problem 
is  driven  by  the  design  partitioning  problem  and 
the  dynamics  of  circumstances  over  time. 

Project  aggregates  (i.e.,  man  loading  versus 
time  and  modules  complete  versus  time)  should  be 
made  access ible ‘ on  a  periodic  or  demand  time 
basis.  To  meet  this  goal  the  actual  progress 
must  be  compared  against  the  planned  progress. 

The  ability  to  estimate  needed  resources  from  a 
skeleton  design  is  implied. 

Product  design  tools  may  be  categorized  as 
existing  tools  or  advanced.  Existing  or  avail¬ 
able  tools  customarily  include  assemblers, 
linkers,  program  preparation  aids  (editors,  etc.) 
debug  assist  tools  (path  tracers,  etc.)  and  high 
level  language  manipulators  (compilers,  etc.). 
Advanced  tools  are  defined  to  cover  very  high 
level  languages  (automatic-program  generators, 
etc.),  products  to  assit  in  creating  and  debugging 
executive  and  real-time  programs,  products  to 
assist  to  performance  analysis,  and  products  to 
assist  in  verifying  distributed  data  processing 
conf igurat ion. 
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Table  I.  Existing  Software  Tools 


Tool  Name 

function 

Life  Cycle  Phase 

Computer 

where 

Software  Cost 
Estimating 

Program 

Estimates  effort  for  development 
cycle  in  man-months,  by  phase 

Proposal  and  major 
mi  lestones 

Fortran 

CDC  66XX 

TRW 

Software 

Requirements 

Engineering 

Program 

Analyzes  requirements  by 
relational  data  base 

Conceptual 

Pascal 

CDC  7600 

TRW 

Manuscript 

Preparation 

System 

Text  is  entered  to  a  computer 
file  by  remote  terminal  and 
edited 

Conceptual  and  as 
needed 

Compass 

CPC  CYBER  74 

CDC  6000  Series 

TRW 

Program  Design 
Language 

Design  is  written  in  structured 
English  by  six  control  constructs 

Definition 

Fortran 

360/370,  1108, 
6XXX,  DEC-10, 

SEL  932, 

POP  10/20/11 

Ca i ne , 
farber 
and 

Gordon, 
Inc . 

High-Powered 

Accounting 

Resources 

Program 

Provides  graphic  display  of 
scheduling  and  resource 
information 

Definition 

fortran 

6XXX,  Cal  comp 

TRW 

Performance  and 
Configuration 
Analysis  Model 

Represents  systems  by  event 
logic  tree's  before  any  code 
exists 

Definition  and 
development 

Fortran 

CDC  6X00,  7X00 

TRW 

Software  Design 
and  Verification 
System 

Supports  design,  code,  test 
and  maintenance  of  DAIS 
mission  software 

Deve’opment 
(simulation  tool) 

Jovial  J73, 

Cobol ,  Assy, 
fortran 

TRW 

DEC  10 
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Go-Alwjd  T  i  n  i  Operational  Demenstration 

w  ^  Typical  Development  Cycle  Phases  K 


26 


Why  People  Do  Not  Use  Tools 

This  group  believes  that  the  topmost  issues 
is  the  need  to  invest  time  to  educate  and  train 
the  practitioner.  As  viewed  by  the  manager,  the 
payoff  for  investment  of  time  is  not  clear  and  not 
presently  well  documented  or  understood. 

Some  existing  tools  work  poorly  and  bias 
managers  (and  programmers)  against  future  use  of 
any  new  tools.  Programmers  do  not  willingly  learn 
about  the  existence  of  new  tools  accessible  to 
them.  Progressive  change  is  difficult  to  instill. 
Standardized  abstracts  and  ascession  lists  would  be 
helpful  here,  especially  in  taking  away  the  unknown 
amount  of  research  into  the  structure  of  the  tool 
and  its  utility  from  questionable  sources.  Tools 
are  not  easily  transported;  what  worked  well  for 
one  project  may  be  commercially  impossible  for 
another  project  because  the  computer  and  configura¬ 
tion  or  language  is  significantly  different. 

In  sum,  people  do  not  use  tools  because; 

a)  They  do  not  see  a  direct  benefit  to  them, 

b)  They  do  not  understand  the  specific  tool 
and  perceive  a  high  risk  of  failure,  low 
chance  of  success,  or  poor  initial  tool 
behavior  that  could  be  blamed  on  them, 

c)  Management  has  coerced  the  project  as  a 
whole  into  using  a  specific  tool,  despite 
inadequate  training  and  planning  for  its 
introduction.  People  wait  for  somebody 
else  to  be  first. 

d)  They  are  pressed  to  meet  a  difficult 
schedule  and  have  no  time  to  experiment 
with  a  new  tool  or  the  accompanying  new 
techniques. 

e)  They  perceive  that  the  proposed  tool  does 
not  work  at  all  in  their  particular 
environment.  On  the  other  hand,  they  may 
not  know  the  tool  exists. 

In  contrast,  people  do  use  tools  because  they 
see  a  direct  benefit  to  them,  management  encourages 
them  in  various  ways,  they  see  a  good  chance  of 
success  (and  no  alternative  without  it),  it  is  new 
and  exciting,  there  is  good  introduction  and 
training  strategy,  they  have  slack  time  in  which  to 
experiment,  they  are  rewarded  by  management 
measures  for  its  use,  and  the  tool,  is  a  part  of  a 
package  they  use  anyway. 

Intrinsic  Advantage  of  Tools 

In  the  sense  used  here,  tools  should  encourage 
a  standard  approach  to  solving  a  recurrent  problem. 
The  form  of  the  output  should  be  such  that  it 
directly  meets  the  informational  requirements  of 
immediate  and  successive  activities.  For  example, 
a  project  management  tool  should  generate  reports 
that  directly  compare  the  actual  completion  status 
with  the  planned  status  without  the  inconvenience 
of  working  among  multiple  reports. 


performance  by  putting  off  tedious  or  repetitive 
actions  that  can  be  done  by  an  algorthm  into  a 
computer.  The  person  is  then  free  to  look  for 
patterns,  trends,  and  relationships,  analyze 
results,  and  bring  to  bear  his  own  creativity. 

Tools  can  and  should  encourage  a  professional 
attitude  toward  work.  The  individual  can  more 
easily  experiment  and  do  creative  tasks.  He  is 
producing  openness  and  visibility  into  an  other¬ 
wise  invisible  process.  He  can  improve  product 
quality  by  systematically  reducing  indigenous 
errors  and  know  that  he  has  done  so.  In  turn,  the 
manager  sees  an  effective  allocation  of  tasks 
between  humans  and  machines  according  to  quality 
and  then  efficiency  if  quality  has  been  assured. 
More  consistent  results  means  more  manageable 
results . 

Tool  Selection  Criteria 

The  proposed  tool  should  automate  any 
repetitive  part  of  the  programming  or  design 
task,  increase  productivity,  accuracy,  and  improve 
morale . 

The  proposed  tool  should  automate  the  ex¬ 
traction  and  correlation  of  vaiious  data  files 
containing  management  or  technical  information 
about  the  program. 

The  proposed  tool  should  assist  in  formal¬ 
izing  software  development  procedures  to  assure 
a  consistent  management  approach  and  development 
methodology.  A  tool  should  enable  people  to 
operate  at  a  higher  level  of  competence.  Some 
things  only  a  tool  can  do  and  do  well,  e.g., 
indicate  which  branches  of  logic  have  and  have  not 
been  exercised  for  a  given  level  of  testing  and  a 
selected  data  state  vector. 

The  proposed  tool  should: 

a)  Meet  all  known  requirements,  have 
capacity  for  growth,  accommodate  a 
reasonably  wide  application  of  use, 
and  have  a  long  expected  lifetime. 

b)  Be  maintainable,  transferable  (within 
technical  limits  of  the  intended 
environment),  and  adaptable  for  training. 

c)  Fit  well  with  other  tools  already  in  use 
or  planned.  It  should  compare  favorably 
with  the  capability  of  the  tools  it  will 
become  a  part  of. 

d)  Compensate  for  a  deficiency  in  resources 
or  organizational  layout.  It  should 
meet  cost/benefit  criteria  established 
by  group  standards. 

e)  Deal  fairly  with  human  factors,  for 
instance,  by  converting  from  arbitrary 
internal  units  to  external  engineering 
units  understandable  by  a  person. 


Tools  can  and  should  encourage  improved  human 


LIFE  CYCLE  MAINTENANCE 


By  comonly  accepted  practice,  the  software 
life  cycle  consists  of  the  development  phase  and 
the  maintenance  phase  taken  collectively . 

Contract  developers  have  been  known  to  talk  about 
the  "life  cycle"  when  they  really  mean  the  "devel¬ 
opment  cycle. "  This  clarification  must  be  made 
at  once.  A  typical  distribution  of  resources  for 
a  large-scale  software  project  might  be  IOC 
people  for  two  years  in  the  development  phase  and 
35  people  for  eight  years  in  the  maintenance  phase. 

Basic  Maintenance  Circumstances 

In  this  hypothetical  case,  the  development 
phase  exists  from  contract  go-ahead  to  opera¬ 
tional  demonstration  and  government  acceptance  by 
DD  Form  250.  The  maintenance  phase,  by  definition, 
is  then  the  interval  when  everything  that  is  not 
development  happens  to  the  software  package.  Of 
the  480  man-years  hypothesized,  200  man-years  is 
expended  in  development  and  280  man-years  in  main- 
tanance.  Thus  about  40  percent  of  the  life  cycle 
resources  are  given  over  to  the  building  (i.e., 
development)  and  about  60  percent  to  keeping  the 
software  package  in  the  existing  state  of  readi¬ 
ness,  efficiency,  or  validity  (i.e.,  maintenance). 

With  the  majority  of  the  out-of-pocket 
expense  to  the  government  going  into  something 
we  all  call  maintenance  (60  percent  versus  40 
percent),  unquestionably  the  most  important 
factors  in  maintenance  need  to  be  examined 
closely.  Unfortunately,  although  maintenace  is 
the  most  visible  and  costly  phase  it  is  the  least 
documented  and  understood.  In  some  areas  of 
specialization  (GTE-AEL) ,  they  have  a  defined 
life  cycle  of  about  twenty  years  (i.e.,  the  life 
of  a  telephone  switch).  Probably  in  this  example, 
good  records  of  reliability  and  cost  are  avail¬ 
able  but  the  ability  to  generalize  from  the 
specific  is  not  productive  for,  say,  a  command 
and  control  application. 

One  reason  the  industry  does  not  have  good 
answers  to  strategic-planning  issues  in  mainte¬ 
nance  is  that  the  government  writes  a  contract  for 
development  and  then  writes  a  different  (annual) 
contract  for  maintenance,  often  under  a  level  of 
effort  (LOE) .  All  the  thinking  goes  into  devel¬ 
opment  because  the  LOE  type  of  maintenance  is 
managed  by  a  policy  of  sequence  and  priority. 

The  maintenance  contractor,  who  may  or  may  not  be 
the  developer,  accepts  any  and  all  work  required, 
provided  that  each  task  is  listed  (sequenced)  by 
the  customer’s  priority  needs.  Not  a  great  deal 
of  strategic  planning  is  needed  in  this  case, 
since  the  manager  is  reacting  to  his  customer's 
needs  on  a  day  to  day  plan.  Usually  any  task 
appearing  on  the  lower  80  percent  of  the  list  is 
never  completed  (Pareto's  Law). 

However,  there  are  some  things  that  can  be 
aaid  and  some  intelligent  steps  the  contract 
developer  and  the  government  can  take  to  deal 
with  this  circumstance.  The  group  considered 


four  key  concerns  with  varying  levels  of  detail: 
enhancement  maintenance,  introduction  of  a  new 
system  or  repair  of  an  old  system,  design  trade¬ 
offs  for  maintainability,  and  effects  of  maint¬ 
enance  of  development. 

Enhancement  Maintenance 

One  way  to  deal  with  the  question  of  pre¬ 
dicting  and  measuring  the  effective  life  span 
for  a  software  system  is  to  adopt  the  point  of 
view  that  a  software  system  will  be  around  for¬ 
ever.  Let  us  define  "forever"  as  20  years. 

After  the  original  package  is  delivered  to  the 
operational  user,  the  new  point  of  view  is  that 
the  original  version  will  continuously  evolve 
under  a  mini-development  cycle  concept.  Then, 
everything  the  contractor  does  in  development  is 
just  repeated  in  the  enhancement  maintenance 
circumstance. 

Feasibility  studies,  tools,  management  pro¬ 
gress  monitoring,  test  and  acceptance  are  carried 
out  just  as  in  the  original  development.  Except 
now  it  is  harder.  Less  core  space  is  available 
and  more  skill  is  required  to  shoe-horn  in  the 
add-ons.  Arbitrary  style  of  programming  may  have 
been  replaced  by  structured  programming  or  by  a 
program  design  language  so  the  interface  with 
existing  software  is  more  difficult.  Regression 
testing  requires  more  skill,  perhaps  more  soph¬ 
isticated  tools,  and  a  test  rationale  (perhaps 
with  analytic  models)  must  be  available  for 
judging  how  far  back  to  go  with  the  test  pro¬ 
cedures  to  verify  mission  readiness  given  that 
changes  were  made  to  the  previsously  accepted 
code. 

Two  computers  may  be  required,  one  to  use 
online  to  keep  the  operational  program  going  and 
another  to  develop  the  product  improvement  in 
consideration  of  debugging,  system  testing,  and 
operational  readiness  demonstration.  The  main 
thrust  of  the  discussion  is  that  enhancement 
maintenance  can  be  thought  of  an  extension  of  the 
development  process.  The  release  of  product  en¬ 
hancements  is  usually  by  block  release,  i.e., 
work  can  go  on  continuously  but  new  software  is 
introduced  incrementally. 

Introduction  of  a  New  System  (When  to  Redo  an  Old 
System 

If  we  try  to  deal  directly  with  the  question 
of  effective  life  span,  we  are  faced  with  a 
dilemma.  We  believe  the  problem  cannot  be  solved 
in  a  good  or  explicit  way.  The  question  of  tech¬ 
nological  obsolescence  is  not  independent  of 
economic  considerations  such  as  life  time  owner¬ 
ship,  variable  (and  unknown)  user  requirements, 
rapid  new  technology  advances,  and  other  prac¬ 
tical  considerations.  Here,  for  example,  a 
contract  developer  could  not  rationally  allow  a 
budget  percentage  to  cover  maintenance  problems 
of  all  kinds.  He  would  not  be  cost  competitive 
in  his  initial  proposal.  Experience  shows  that 
if  an  equally  qualified  builder  is  more  than 
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10  percent  higher  in  cost  than  the  lowest  bidder 
he  will  most  likely  lose.  He  cannot  realistically 
support  an  argument  that  in  the  long  run  his 
higher  hid  cost  will  prove  to  be  lower  when  con¬ 
sidered  over  the  entire  useful  life  time  of  10  or 
20  years. 

After  a  certain  point  of  growth,  even  with 
modern  virtual  memory  machines,  the  system  will 
have  to  be  redone.  Or,  more  easily  argued  is  that 
the  software  has  to  be  redone  because  the  user 
wants  to  put  an  existing  program  (with  some  en¬ 
hancements)  into  a  completely  new  computer  con¬ 
figuration:  new  language,  new  compiler,  new  data 
storage  and  transfer  methods,  and  revised  proto¬ 
cols  for  real-time  interrupts.  The  question, 
however,  is  why  put  an  existing  and  mission  readv 
software  package  into  a  new  computer  configuration? 
In  a  more  general  sense,  why  are  the  perceived 
attributes  of  the  existing  software  package  not 
equal  to  the  job  at  hand? 

There  are  no  inherent  properties  that  car.  be 
used  to  measure  (predict)  the  life  span  of  a  svs- 
tem.  Given  that  we  can  all  agree  that  the  effec¬ 
tive  lifespan  in  finite,  the  life  span  can  be 
observed  to  bo  over  when: 

a)  The  unit  cost /transact ion  exceeds  the 
projected  unit  cost  of  a  new  svstem  bv 
a  sufficient  margin.  This  estimated 
cost  should  include  re-run  costs  due 
to  low  reliability. 

b)  The  function  that  the  system  serves  be¬ 
comes  obsolete  within  the  sponsoring 
organization.  For  example,  a  plant  mav 
be  closed  and  a  software  svstem  (e.c ., 
process  control)  designed  to  serve  it 
no  longer  serves  any  purpose. 

c)  Defense  measures  require  that  a  hereto¬ 
fore  benign  ground  system  must  he  trans¬ 
portable  and  the  computer  hardware  able 
to  survive  a  particular  nuclear  exposure. 

In  short,  it  is  time  to  redo  the  old  system 
when  it  is  judged  incrementally  cost-effective  to 
recreate  the  svstem  than  to  enhance  it  further. 

This  can  <vciir  because  radical  changes  in  the  fun¬ 
ctional  capability  are  required,  or  because  the 
introduction  of  new  hardware  makes  possible  such 
a  radical  change.  It  can  also  occur  when  the  ef¬ 
fort  required  to  maintain  and  to  enhance  the  ex¬ 
isting  svstem  begins  to  grow  so  rapidly  that  it  is 
cheaper  to  redesign  It.  If  a  suitable  measure  of 
complexity  can  be  found,  a  plot  of  this  measure 
against  svstem  age  is  a  valuable  indicator  of  when 
the  system  is  approaching  a  state  in  which  main¬ 
tenance  or  enhancement  is  no  longer  practical. 

Some  evidence  suggests  that  there  is  an  upper 
limit  to  the  size  of  a  change  which  may  be  made  to 
a  system  at  one  time,  and  that  the  original  struc¬ 
ture  of  the  system  is  a  severe  constraint  on  the 
nature  of  the  functional  changes  which  can  he  ac¬ 
commodated  without  initiating  a  complete  redesign. 


Design  Tradeoffs  for  Maintainability 

Too  many  factors  are  involved  in  designing 
for  maintainabi 1 i tv  to  consider  the  issue  thor¬ 
oughly.  However,  some  isolated  findings  offer 
promise  for  future  AIRM1CS  studies: 

a)  Greater  effort  in  the  beginning  (defini¬ 
tion  and  design)  by  stressing  ease  of 
understanding,  modularity,  and  ease  of 
use  (human  engineering).  This  will  ho 

a  commercially  impossible  goal  unless 
some  reward  system  is  built  into  the 
government  procurement  policy  (i.e.,  an 
incentive  for  doing  more  cost l v  work). 

b)  Emphasis  on  software  quality  assurance 
and  use  of  all  available  techniques  to 
ensure  correctness  early  in  the  software 
life  cycle.  The  more  promising  relia- 
billtv  models  should  begin  to  be  intro¬ 
duced  into  development  test  and  beyond. 

c)  I'se  of  machine  assists  to  detect  pro¬ 
grammer  errors. 

d)  I'se  of  library  maintenance  and  other 
tools  to  assist  in  correct  changes  to  a 
prog  ram . 

e)  Greater  emphasis  on  adaptability.  Exper¬ 
ience  shows  that  large  systems  suffer 
continual  change  in  the  first  12  months 
and  that  after  2  to  S  vears  tvpicallv 
verv  little*  of  the  original  code  is  left 
in  the  svstem. 

i  ts  of  Mu  i  nten.inee  on  Development 

Maintenance  oonstderat  ions  plav  a  role  in  all 
phases  of  development.  At  each  stage  the  product 
of  that  stag**  is  examined  (at  every  design  review, 
code  wa 1 k-t brough ,  unit /string  testing  etc.). 

The  results  .re  reviewed  and  a  decision  made  as  to 
whether  the  stag*  -thou Id  be  reiterated,  realizing 
that  deficiencies  ..crept ed  at  an  earlier  phase 
will  result  in  maintenance  difficulties  later. 

I’erhaps  the  onlv  workable  approach  to  reduc¬ 
ing  the  cost  of  maintenance  is  for  the  government 
to  adopt  a  procurement  policy  in  which  the  govern¬ 
ment  explicitly  acknowledges  maintenance  as  part 
of  the  life  cycle  by  the  kind  of  procurement  con¬ 
tract  applied  in  the  first  place.  Good  procedures 
now  exist  for  ordcrlv  change  procedures  to  be 
built  into  the  contract.  However,  the  implication 
usuallv  is  that  the  sponsor  and  the  contract  de¬ 
veloper  are  in  adversary  roles.  Constraint  will 
be  needed  by  the  government  program  office  in 
keeping  their  requirement  stable,  which  in  turn 
keeps  the  design  and  test  activity  stable  and 
matched  to  Its  necessary  and  sufficient  mission 
demands,  no  more  and  no  less. 

MODI  FT F.D  DELHI! 1  METHODOLOGY 


In  this  group  we  used  an  each-to-all  Delphi 


method  in  working  toward  group  analysis  of  n 
common  problem.  A  position  was  given  orally  by 
each  participant,  and  these  appear  as  a  written 
paper  in  the  next  section.  Where  clarification 
was  needed,  questions  were  asked  about  a  given 
position.  Basically,  more  information  was  being 
transferred  than  there  was  time  to  dig  into  all 
of  it.  Questions  were  posed  and  the  group  nar¬ 
rowed  them  down  to  a  somewhat  manageable  list. 
There  were  many  more  questions  than  answers, 
and  a  fact  probably  worthy  of  notice.  Research 
is  being  applied  to  many  problem  areas,  but  the 
more  problems  that  are  solved  the  more  questions 
are  asked.  Man's  reach  shall  always  exceed  his 
grasp. 

One  small  facet  of  the  interesting  group 
dynamics  that  occurs  when  many  competent  people 
are  in  the  same  room  grappling  with  the  same 
questions  is  given  in  the  next  two  worksheets: 

a)  Pre-discussion  self-rating  sheet  that 
is  concerned  with  the  participant's 
image  of  himself  with  respect  to  his 
peer  group  before  getting  into  substan¬ 
tive  issues.  In  this  case  the  results 
include  this  session,  and  John  Manley's 
Session  T.  The  mean  (x)  and  variance  (s) 
is  indicated  graphically  on  each  question. 

b)  Post-discussion  evaluation  sheet  that  is 
concerned  with  the  participant's  evalu¬ 
ation  of  the  results  of  two  days  of 
group  endeavor  to  answer  questions  of 
interest  to  the  administration  of  the 
AIRMICS  78  workshop.  As  shown  almost  un¬ 
animously  the  participants  see  the  major 
problems  as  people-oriented  and  hardly 

at  all  machine-oriented.  Such  a  con¬ 
clusion  is  probably  not  very  surprising. 
However,  the  worksheets  may  help  the 
reader  to  understand  some  of  the  comments 
rising  from  this  session.  Every  state¬ 
ment  made  cannot  help  but  reflect  the 
cultural  attitude,  subjective  bias,  and 
knowledge  of  each  participant. 
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SOFTWARE  LIFE  CYCLE  MANAGEMENT  WORKSHOP 
Atlanta  -  21  -  22  August  1978 


I.  Pra-Discussion  Saif -Rating  Sheet 


Code  Number 


1. 

As  an  SLCM  participant,  my  skills  in 
software  life  cycle  management  would 
put  me  about  here,  relative  to  others. 

Very  highly 
skilled 

♦  . 

No  skill 

at  all 

12  3  4 

5 

6 

7 

8 

2. 

1  think  my  ideas  are  in  basic  agreement 
with  the  rest  of  the  participants. 

Yes,  Absolutely 

♦ 

No,  not  at  all 

12  3  4 

5 

6 

7 

8 

3. 

1  know  most  of  the  people  in  the  SLCM 
workshop  very  well. 

Yes. 

pretty  well 

. * 

No,  none 
at  all 

12  3  4 

5 

6 

7 

8 

4. 

1  have  some  definite  ideas  about  what  the 
goals  of  the  AIRMICS  SLCM  workshop  are 
and  should  be. 

Yes,  a  lot 

* 

No,  none 

12  3  4 

5 

6 

7 

8 

5. 

1  have  been  in  software  life  cycle  management 
for  longer  than  most  of  the  other  people  here. 

Yes 

. * . 

No 

12  3  4 

5 

6 

7 

8 

6. 

1  have  a  lot  of  experience  in  SLCM  practice 
outside  of  a  university  environment. 

Yes 

.  *  . 

No 

12  3  4 

5 

6 

7 

8 

7. 

1  am  anticipating  that  the  SLCM  conference  is 
going  to  be  a  good  thing  for  goal-setting. 

Yes,  1  think 
it  wilt  be 

♦ 

No,  1  think  it 
may  be  a 
waste  of  time 

12  3  4 

5 

6 

7 

8 

8. 

My  approach  to  problem  solving  for  this  con¬ 
ference  is  best  described  as  "people  oriented." 

Yes,  abwlutely 

No,  not  at  all 

12  3  4 

5 

6 

7 

8 

9. 

My  approach  to  problem  solving  for  this  con¬ 
ference  is  best  described  as  "machine  oriented.” 

Yet,  absolutely 

A- 

No.  not  at  all 

12  3  4 

5 

6 

7 

8 
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SOFTWARE  LIFE  CYCLE  MANAGEMENT  WORKSHOP 
Atlanta  -  21  -  22  August  1978 

Coda  Number 


II.  Post-Discussion  Evaluation  Sheet 


1. 

1  feel  satisfied  with  the 

4r 

_ 

I'm  not  really  happy  with 

results  in  general. 

1  2 

3 

4 

5 

6 

7 

the  results  at  all. 

2. 

1  got  some  ideas  from 

♦ 

1  didn't  learn  a  thing  from 

the  feedback. 

1  2 

3 

4 

s 

6 

7 

the  feedback. 

3. 

In  general.  1  agreed  with 

1  disagreed  with  everything 

the  ideas  in  the  feedback. 

1  2 

3 

4 

5 

6 

7 

in  the  feedback. 

4. 

1  could  express  my  ideas 

_ ± 

1  couldn't  really  express  what 

OK  this  way. 

t  2 

3 

4 

5 

6 

7 

1  wanted  to  say. 

5. 

1  feel  as  if  1  really 

♦ 

1  didn't  feel  the  need  to  talk 

wanted  to  talk  to  people. 

1  2 

3 

4 

5 

6 

7 

at  all. 

6. 

1  think  people  understood 

1  have  a  feeling  people  didn't 

my  reasons  pretty  well. 

. jfc  -- 

_ 

_ _ 

understand  or  think  about  my 

1  2 

3 

4 

5 

6 

7 

reasons. 

7. 

1  think  the  SLCM  con 

1  don't  think  this  SLCM  con 

ference  structure  could 

ference  structure  could  be 

be  operational  in  goal 

1 _ 

_ 

operational  at  all. 

setting  more  generally. 

1  2 

3 

4 

5 

6 

7 

- . 

8. 

1  think  it  went  too  fast. 

1  2 

3 

4 

5 

6 

7 

1  think  it  went  too  slowly. 

9. 

1  would  be  pleased  to 

1  would  not  wish  to  attend 

attend  the  next  AIRMICS 

. 

again. 

conference. 

1  2 

3 

4 

5 

6 

7 

III.  LIFE  CYCLE  MANAGEMENT  MEASUREMENT 
MODELS  -  PREDICTIVE 

Chairman:  Mr.  Lawrence  H.  Putnam 
Quantitative  Software  Management,  Inc. 

PANELISTS 

Barry  U.  Boehm  Bev  Littlewood 

Amrit  L.  C.oel  John  D.  Musa 

Mler  M.  Lehman  Robert  C.  Tausworthe 

Marvin  V.  Zelkowitz 
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LIFE-CYCLE  MANAGEMENT  MEASUREMENT  MODELS:  PREDICTIVE 

SUMMARIZED  BY 
Lawrence  H.  Putnam 


Quantitative  Software  Management,  Inc. 


Current  life  cycle  models  have  been  inade¬ 
quate  to  predict  cost,  schedule,  quality  and  re¬ 
liability.  Group  III  examined  the  problem  from 
three  perspectives:  management  issues,  pheno¬ 
menological  behavior  and  reliability  measurement 
and  prediction. 

Central  to  the  thinking  of  all  was  the  notion 
that  models  were  needed  that  provided  adequate 
accuracy,  faithfulness  to  the  process,  simplicity 
of  use,  timliness,  and  addressed  investment  and 
management  questions  directly  in  management  para¬ 
meters  —  time,  cost,  manpower,  cash  flow,  rate 
of  progress,  effectiveness  and  reliability. 

The  fundamental  issues  identified  are: 

•  Lack  of  standard  definitions  and 
metrics  for  the  life  cycle 

-  activities 

-  phases 

-  milestones 

•  A  detailed  process  model  is  needed 

•  A  catalog  of  existing  descriptive 
and  predictive  models  is  needed. 

The  catalog  should  contain: 

Description  of  model 

Assumptions 

Purpose 

-  Capabilities  (positive  and  negative) 

•  A  careful  evaluation  of  existing 
models  is  needed  (This  should  be  done 
interatively  with  the  creator  to  be 
sure  that  important  characteristics 
and  nuances  are  not  omitted  in  sum¬ 
marization)  . 

•  A  good  life  cycle  model  should 
possess  these  characteristics: 

-  Consider  all  activities  and  phases 

Relate  management  parameters  to 
management  responsibilities 


Plan  Control  Evaluate 


Function  .  .  .  .  .  . 

Cost  .  .  .  .  .  . 

Time  .  .  .  .  .  . 

-  Be  adaptive  to  actual  project  data  and 
requirements  changes  (i.e.,  must  be 
time-varying  or  dynamic). 

Provide  engineering  accuracy  (and  uncer¬ 
tainty  measures  until  it  is  safe  to 
ignore  them  because  of  standards  conven¬ 
tions  (e.g.,  building  codes,  electrical 
codes,  etc.)  in  cost,  schedule  and  qual¬ 
ity 

-  Provide  sensitivity  profiles. 

-  Be  phenomenologically  based. 

Relate  produce  to  resource  comsumption 
(both  statically  and  dynamically)  and 
the  technology  being  applied. 

-  Be  capable  of  future  growth. 

-  Be  able  to  adequately  treat  known  and 
future  system  types  and  development 
environments. 


This  group  was  composed  of:  Lawrence  H. 
Putnam,  of  Quantitative  Software  Management,  Inc., 
Chairman;  Barry  W.  Boehm  of  TRW  Defense  and  Space 
Systems  Group;  Amrit  L.  Goel,  Syracuse  University, 
Meir  M.  Lehman,  Imperial  College  of  Science  &  Tech¬ 
nology/England;  Bev  Littlewood,  City  University/ 
London,  England;  John  D.  Musa,  Bell  Telephone  Lab¬ 
oratories;  Leon  G.  Stuck!,  Boeing  Computer  Ser¬ 
vices,  Inc.;  Robert  C.  Tauswortho  ,  Jet  Propulsion 
Laboratory;  Claude  Walston,  IBM  Federal  Systems 
Division;  and  Marvin  V.  Zelkowitz,  University  of 
Maryland. 


The  Group  III  people  found  it  worthwhile  to 
sub-divide  themselves  into  three  sub-work  groups, 
and  to  devote  their  attention  to  the  special  areas 
of  expertise  in  which  they  could  deal  with  the 
subject  matter  in  greater  depth  and  address  a 
smaller  sub-set  of  the  questions  posed  by  the 
Army  in  a  more  comprehensive  manner.  Three  sub¬ 
ject  areas  were  to  be  explored.  (1)  Reliability 
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Models.  The  people  who  worked  on  this  sub-task 
were  John  Musa,  Bev  Littlewood,  and  Amrit  Goel. 
(2)  Life  Cycle  Models.  The  people  concerned  with 
this  sub-task  were  M.  Lehman,  Claude  Walston, 
Marvin  V.  Zelkowitz.  (3)  Management  Issues  and 
the  Resource  Control  aspects  that  fit  within 
the  managerial  framework.  Barry  Boehm,  Bob 
Tauseworthe  and  Leon  Stucki  addressed  these 
topics. 


We  will  consider  these  in  the  order  of 
Management  Issues,  Reliability  Models  and, 
finally,  the  Life  Cycle  Models. 


MANAGEMENT  ISSUES 


:'he  Management  Issues  sub-task  group  con- 
,ed  itself  with  the  following  questions: 

What  are  the  major  ingredients  in  the  management 
of  software?  What  makes  it  unique?  What  makes 
it  different  from  hardware?  How  should  the 
organizational  structure  relate  to  the  problem 
to  be  solved  in  the  different  phases  of  develop¬ 
ment?  To  what  extent  should  managers  be  tech¬ 
nically  trained?  To  what  extent  should  techni¬ 
cal  personnel  be  managerially  trained?  Are  there 
different  classifications  of  software  that  re¬ 
quire  different  methods  of  management  (For  exam¬ 
ple,  embedded  computers  and  software  vs.  non-em- 
bedded  computers  and  software)?  Are  there  pre¬ 
dictable  crises  in  the  software  life  cycle,  and 
if  there  are,  what  are  the  early  indicators 
associated  with  these  crises? 


Embedded  within  these  broader  questions  then 
is  the  overall  set  of  fundamental  questions 
which  we  hope  will  be  answered.  They  relate  to 
what  needs  to  be  done  to  improve  the  process. 

We  might  define  these  related  actions  under  the 
broader  heading  of  impirical  studies — basically, 
what  we  need  to  know  to  understand  the  process 
better.  What  information  should  be  collected 
about  the  process,  the  product,  and  their  inter¬ 
action  and  for  what  purpose?  What  kind  of 
experiments  should  be  performed?  How  can  we  cap¬ 
ture  and  express  the  idea  of  program  complexity? 
How  can  program  managers  be  convinced  to  conduct 
experiments  on  their  programs?  What  progress  is 
being  made  on  the  transfer  of  learning  from  one 
project  to  another  project  within  an  organiza¬ 
tion  and  between  organizations? 


An  integral  part  of  management,  of  course. 

Is  resource  planning  and  control.  Within  this 
framework  are  those  things  having  to  do  with 
performance  measures  that  will  measure  the  actual 
progress  of  a  project  against  some  time  base 
which  we  conanonly  called  the  milestones  and  in 
terms  of  rate  of  expenditure  of  the  resources 
allocated  to  the  project  (which  typically  are: 
manpower,  dollars,  computer  time).  The  ability 
to  relate  performance  measures  to  the  consumption 
of  resources  has  been  especially  difficult  in 


managing  software  products.  Attempts  have  been 
made  in  terms  of  productivity.  Productivity  has 
been  defined  as  total  number  of  delivered  source 
lines  of  code  divided  by  the  effort  required  to 
produce  the  code.  Basically,  people  are  unhappy 
with  this  definition  in  that  it  doesn't  really 
relate  to  the  rate  of  progress  on  the  project. 

It  is  a  difficult  measure  and  in  some  sense  may  be 
counterintuitive  to  the  common  industrial  inter¬ 
pretation  of  rate  of  production  used  in  the  con¬ 
text  of  the  industrial  production  line. 

The  management  sub- task  group  wrote  these 
managerial  concerns  into  a  group  of  problems  in 
which  they  identified  the  key  factors,  recommended 
an  approach,  and  gave  a  prognosis  with  respect  to 
possibilities  for  success  and  the  time  frame  in 
which  it  might  be  possible  to  achieve  the  solu¬ 
tion  to  the  problem.  The  first  problem  identified 
by  this  group  was  problem:  The  Inadequate  accur¬ 
acy  of  current  models.  This  has  resulted  in  fre¬ 
quent  overruns.  Parameters  are  often  difficult 
to  estimate,  and  the  non-standard  terms  and 
metrics  that  are  used  in  these  various  models 
complicate  interpretation. 


KEY  FACTORS: 

There  are  no  standard  metrics  and  terminology 
within  the  industry,  or  within  major  subdivisions 
of  the  government.  There  are  inadequate  empirical 
applications  of  the  models  (l.e.,  there  is  no 
practical  application  and  subsequent  feed  back  so 
that  the  models  are  selt-improving) . 


RECOMMENDED  APPROACH: 

Establish  standard  definitions;  establish 
refined  data  collection  procedures;  collect  addi¬ 
tional  empirical  feedback  leading  to  refinements 
and  tuning  of  models  to  make  them  better. 


TIMING: 

Reasonably  near  term  (2-3  years). 


•  PROBLEM:  Models  need  to  be  evaluated  with 

respect  to  a  set  of  management -ori¬ 
ented  criteria. 


KEY  FACTORS: 

Timeliness,  updatability ,  definition,  objec¬ 
tivity,  detail,  parsimony.  The  models  should  be 
extensible,  contractible,  tailorable.  These 
should  be  a  pragmatically  understandable  corres¬ 
pondence  between  criteria.  The  models  should 
support  sensitivity  analyses.  And  the  models 
should  be  adaptive,  that  is,  they  should  respond 
to  the  project  dynamics;  what  is  actually  happen¬ 
ing  should  be  fed  in  as  it  occurs  and  the  model 
should  adapt  to  that  in  terms  of  the  future  pro- 
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jection  it  makes  for  the  next  few  time  intervals. 
RECOMMENDED  APPROACH: 

There  is  a  need  to  extract  meaningful  manage¬ 
ment  criteria  from  these  Key  Factors.  Evalua¬ 
tions  should  be  performed  to  establish  a  standard 
accepted  set  of  terminology  to  develop  new 
classes  of  models  that  will  handle  a  broader 
range  of  phases  and  activities  within  the  soft 
ware  development  and  maintenance  process. 


TIMING: 

Reasonably  near  term  (2-3  years) ,  leading  to 
longer  term  pay-offs  in  the  medium  range 
(5-7  years)  period. 


•  PROBLEM :  Current  models  are  not  well  related 
to  the  project  status  indicators. 


KEY  FACTORS: 

The  definition  of  status  indicators,  (for 
example,  CDR,  or  Critical  Design  Review). 
Obsolute  software  standards  (e.g.  MS-1521  and 
MS-881).  Inadequate  detail  (e.g.  work  break¬ 
down-structure,  and  lower  level  milestones). 


RECOMMENDED  APPROACH: 

Define  a  more  detailed  life-cycle  process 
model  (include  a  greater  number  of  lower  mile¬ 
stones  within  the  work  breakdown  structure)  . 
Relate  global  status  indicators  to  the  detailed 
process  model.  Update  the  relevant  software 
standards.  Relate  the  predictive  models  to  the 
detailed  process  model. 


TIMING: 

Reasonably  near  term  (2-3  years). 


•  PROBLEM:  Current  models  are  inadequate  in 
relating  productivity  and  relia¬ 
bility. 


KEY  FACTORS: 

The  terms  are  difficult  to  define.  There 
are  no  standard  definitions. 


RECOMMENDED  APPROACH: 

Develop  new  models  relating  productivity 
with  reliability;  establish  standard  accepted 
definitions  that  adequately  describe  in  a 
meaningful  way  productivity  and  reliability  fea¬ 
tures  that  we  want  to  see  within  the  models. 


TIMING: 

Reasonably  near  term  (2-3  years). 


•  PROBLEM:  The  current  models  do  not  adequately 
cover  some  key  issues: 

-  maintenance,  conversion,  block 
updates 

-  the  impact  of  new  technology 


KEY  ISSUES: 

Understanding  the  underlying  phenomenology  of 
the  software  building  process  and  how  to  use 
it  in  the  model. 

the  unknown  domains  of  applicability  of  the 
models. 


RECOMMENDED  APPROACH: 

determination  of  areas  of  applicability  for 
existing  models  (include  underlying  assump¬ 
tions)  . 

-  develop  additional  models  to  cover  the  poorly 
developed  issues  and  areas;  this  implies 
more  detailed  definitions  and  a  greater  data 
collection  effort. 


TIMING: 

Reasonably  near  term  (2-3  years).  Some  areas 
will  require  better  data  for  significant  pay  offs 
and  this  will  necessitate  longer  periods  of  time 
within  the  mid-range  period  (5-7  years). 


•  PROBLEM: 

A  lack  of  models  for  other  areas  of  manage¬ 
ment  purview 

other  resources  (e.g.  core  requirements) , 
other  situations  (e.g.  distributed  net¬ 
works,  micro-computers) 

personnel  career  progression 

-  life  cycle  dynamics  of  software  as  a 

"Business  Game"  model  similar  to  what  is 
now  done  in  the  large  business  schools 
in  which  a  complete  business  scenario  can 
be  played  out  and  development  outcomes 
determined  depending  on  the  input  and  the 
actions  of  the  players. 


KEY  FACTORS 

Complete  absence  of  models  of  this  type 

-  Non-standard  situations  and  organizational 
structures  within  various  business  enti¬ 
ties  and  various  government  organizat ions. 


1 
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subjective  versus  objective  decision 
making 

requirements  for  such  models  are  not 
recognized 


RECOMMENDED  APPROACH: 

create  awareness  for  the  value  of  such 
models 

-  develop  model  goals,  requirements, 
criteria,  etc. 

develop  adequate  models 

-  train  management  to  use  the  models  in  the 
decision  making  process 

-  sell  management  on  the  utility  of  using 
such  models 


TIMING: 

Reasonably  near  term  (3-5  years).  Prospect 
for  success  is  good  in  modeling  quantitative 
measures  (e.g.,  life  cycle  dynamics,  core). 


RELIABILITY 

The  reliability  subtask  group  address  the 
general  set  of  questions  concerned  with  models 
but  specifically  directed  their  responses  toward 
reliability-oriented  models  to  put  together  an 
ordered  set  of  criteria  for  good  predictive 
models  in  the  reliability  area. 

Accordingly,  the  problems  addressed  in  this 
section  are  ordered  in  terms  of  their  priority 
of  need. 


*  PROBLEM:  Need  for  data. 


KEY  FACTORS: 

A  need  for  execution  time  data  rather 
than  calendar  time  data. 

better  planning  of  data  collection  efforts 
(this  should  be  done  in  conjuction  with 
reliability  researchers) 

need  cost  impact  data 

need  data  on  resources  used  in  identify¬ 
ing  and  collecting  the  data. 


RECOMMENDED  APPROACH: 

Detailed  studies  should  be  undertaken  to: 

specify  what  data  should  be  collected 
and  how. 

-  study  should  be  reviewed  by  the  principal 
researchers  in  the  field. 


TIMING: 

Near  future  (2-3  years).  Prospects  for 
success  are  good. 

•  PROBLEM:  Need  a  comparative  study  of  existing 
reliability  models. 


KEY  FACTORS: 

an  analytical/anatomical  comparison 

-  predictively  comparison 

a  physical  interpretat ion  of  the  para¬ 
meters  of  the  models 

simplicity  and  ease  of  understanding  and 
communicating  in  each  of  the  models. 

-  range  of  applicability. 


RECOMMENDED  APPROACH: 

A  serious  analytical  and  empirical  compara¬ 
tive  study  to  ensure  a  correct  interpretation  of 
the  models  and  the  assumptions  that  have  used 
been  used  in  creating  the  models. 

TIMING: 

Such  a  comparative  study  should  start  in  the 
near  future  and  possibly  could  be  completed  by 
1980. 


•  PROBLEM:  The  need  to  validate  the  assumptions 
used  in  existing  models. 


KEY  FACTORS: 

the  independence  assumption  of  failure 
time 

assumption  of  an  exponential  distribution 
being  the  underlying  relevant  statistical 
distribution. 


RECOMMENDED  APPROACH: 

A  study  should  be  carried  out. 


TIMING: 

Near  future  (2-3  years).  The  success  in  this 
endeavor  would  depend  considerably  on  the  availa¬ 
bility  of  data. 


•  PROBLEM:  Relationship  between  test  and  opera¬ 
tional  environments . 
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KEY  FACTORS: 

-  the  effect  on  the  reliability  measures. 

-  how  to  construct  an  appropriate  test 
environment . 

RECOMMENDED  APPROACH: 

Research  in  the  fundamental  areas. 

TIMING/PROSPECTS: 

Time  of  conclusion  is  not  clear.  This 
appears  to  be  a  difficult  problem. 

•  PROBLEM:  Relationship  between  program  struc¬ 

ture  and  reliability  (including 
combinatoric  relationships). 

KEY  FACTORS: 

-  modern  programming  practices 

-  module  switching  (N-th  order  Markov  pro¬ 
cesses) 

-  information-theoretic  approach 

RECOMMENDED  APPROACH: 

Further  research 

TIMINC/PROSPECTS: 

Medium  term  (3-5  years). 

•  PROBLEM:  What  quality  performance  measures 

are  meaningful  and  useful?  What 
decisions  would  be  supported?  How 
might  management  decisions  affect 
selected  performance  measures? 

KEY  FACTORS: 

-  availability 

-  cost  impact  measure 

-  predict  project  completion 

-  tradeoffs  between  quality  measures  and 
time/cost. 

RECOMMENDED  APPROACH: 

Manager  survey  study. 

TIMINC/PROSPECTS: 

Near  term  (1-2  years).  Propects  for  success 
are  good. 


•  PROBLEM:  Getting  software  reliability  concepts 

accepted  and  used. 

KEY  FACTORS: 

-  selling — convincing  managers  that  these 
concepts  and  techniques  are  useful. 

-  integration  and  simplification  of  concepts. 

-  adapting  reliability  as  a  system  require¬ 
ment.  One  possible  approach  is  that 
reliability  should  be  considered  as  one  of 
the  elements  in  an  evaluation  of  a  proposal. 

Other  recommendations  are  included  under  the 
other  problem  areas. 

TIMING/PROSPECTS: 

(2-5  years).  Contingent  upon  success  in 
other  problem  areas. 

•  PROBLEM:  What  sort  of  error  taxonomy  is  useful? 

KEY  FACTORS: 

-  need  an  end-use  orientation  classification 
scheme 

-  need  to  collect  data 

-  Is  a  multi-variate  model  needed  to  handle 
error  severity  classes ? 

RECOMMENDED  APPROACH: 

Conduct  a  study  on  planned  uses  of  error 
data  to  develop  an  appropriate  classification 
scheme. 

TIMING/PROSPECTS: 

Medium  term  (2-5  years).  Prospects  reason¬ 
ably  good. 

•  PROBLEM:  How  is  changing  technology  going 

to  affect  software  reliability 
measurement? 

KEY  FACTORS: 

-  microprocessors 

-  networking 

RECOMMENDED  APPROACH: 

-  Augment  Rome  Air  Development  Center  micro¬ 
processor  study 

-  Initiate  networking  study 
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TIMING/PROSPECTS: 

Near  term  (2-5  years).  Start  now.  Prospects 
are  good. 


The  Group  dealing  with  LIFE  CYCLE  SOFTWARE 
MODELS  AND  METHODOLOGIES  FOCUSED  ON  THREE 
MAIN  AREAS: 

•  Initial  answers  to  the  questions  posed 
by  AIRMICS. 

•  Some  tentative  early  definitions. 

•  Some  recommendations  for  further  work. 

Question  No.  1.  What  needs  to  be  known  to 
understand  the  development  process  better? 

•  activities  -  the  relationship  between  the 
the  activities,  the  flow  between  the  acti¬ 
vities,  and  the  products  coming  out  of  the 
activities,  for  example  design  for  main¬ 
tainability. 

•  Measurement  quantities.  We  need  all  the 
classes  of  the  measurement  quantities  — 
resource  consumption  measures,  rates  of 
accomplishment,  or  progress,  and  quality 
metrics  i.e.,  a  capability  to  measure 
these  actual  quantities  and  relate  them  to 
the  accomplishment  that  is  being  made, 
measures  of  progress,  quality,  product¬ 
ivity,  as  well  as  just  resource  consump¬ 
tion  in  accomplishment  of  time-related 
milestones. 

•  We  need  real  data  for  each  of  these  acti¬ 
vities  and  measurable  quantities. 

•  A  full  analysis  is  possible  now.  It 
requires  a  concerted  effort  by  a  team  of 
experts  spanning  the  disciplines  involved 
in  the  total  life  cycle. 


Question  No.  2.  What  information  should  be 
collected  about  the  process,  product  and  their 
interactions? 

•  The  answer  to  this  question  is  in  part  an 
answer  to  question  1,  above.  But  we  also 
need  to  know  for  what  purpose  ?  The 
answer  to  this  would  appear  to  be  to  model 
and  use  for: 

•  Management,  control  and  evaluation. 

•  Improvement  of  the  process. 

Neither  can  be  done  adequately  in  order  to 
achieve  full  life  cycle  effectiveness  without  an 
adequate  understanding. 


Question  No.  3.  What  experiments  and  evalu¬ 
ations  need  to  be  performed? 

•  controlled  experiments  should  not  be  used 
because : 

-  We  cannot  isolate  the  problem. 

Extrapolation  is  not  possible  for  small 
projects. 

It  is  too  expensive. 

The  recommended  approach  should  be  to  conduct 
studies,  gather  data  and  tie  back  to  analysis 
based  on  common  definitions  and  standard  measure¬ 
ments,  techniques  and  models. 


Question  No.  4.  How  can  we  capture  program 
complexity? 

•  There  are  a  number  of  existing  investi¬ 
gations  now  ongoing. 

•  We  should  monitor  these  carefully. 


Question  No.  5  .  How  can  we  convince  managers 
to  experiment  on  thier  software  projects? 

•  A  straight  answer  is  —  Don't  attempt  to 
convince  them  to  experiment.  The  real 
question  is  how  to  persuade  managers  to 
collect  data  for  others  to  use.  (There 
is  a  real  problem  here  because  of  fear 
that  collecting  the  data  will  be  used 
against  managers  to  show  that  they  wasted 
resources,  that  they  didn't  manage  effect¬ 
ively.) 

•  A  partial  answer  to  this  question  is: 

(1)  feedback.  It  should  be  a  two-way 
flow.  Data  is  captured  from  the 
managers  to  measure  progress  and  to 
help  improve  the  process,  then  they 
should  get  the  benefit  of  the  feed¬ 
back  to  help  them  manage  better. 

•  A  second  partial  answer  is: 

(2)  automate  the  collection  effort;  make 
it  painless  to  do  so  that  it  doesn't 
interfere  or  take  away  from  the  effort 
that  is  devoted  to  the  project. 


Question  No.  6.  What  progress  is  being  made 
on  the  transfer  of  learning  from  project  to  pro¬ 
ject  and  within  organizations? 

•  Not  much.  But  workshops,  such  as  the 
Software  Life  Cycle  Management  Workshop 
and  conferences  on  the  subject  of  soft¬ 
ware  engineering  help.  At  least  they 
bring  to  the  forefront  an  awareness  of  a 
lack  of  transfer  of  learning  from  project 
to  project,  within  and  between  organiza¬ 
tions. 
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•  In  order  to  leave  an  effective  transfer  of 
learning,  definitions  and  common  terminology  are 
essential. 


Question  No.  7.  What  are  the  criteria  for  a 
good  predictive  model? 

•  Parameters  of  the  model  should  be: 

(1)  based  on  a  standard  set  of  defini¬ 
tions.  For  example,  time,  effort, 
manpower,  end  product  (quality  of 
source) 

(2)  parameters  should  be  measurable. 

(3)  parameters  should  reflect  the  environ¬ 
mental  needs,  not  product  attributes. 

•  Each  model  should  adequately  cover  factors 
causing  variation  in  model  output. 

•  The  set  should  be  adequate  to  cover  the 
entire  life  cycle. 

•  A  clear  understanding  of  the  domain  of 
applicabil ity . 

•  Should  support  management  activity  i.e., 
management  activities  of  planning,  control 
and  evaluation  should  be  able  to  be  dis¬ 
played  against  each  function  together  with 
the  associated  function  time  and  cost  of 
the  activity.  The  table  below  shows  this 
concept. 


Plan  Control  Evaluate 

Function  .  .  .  .  .  . 

.Cost 

T Ime  .  .  .  .  .  . 

Question  No.  8.  How  can  statistical  and  ana- 
lystical  models  be  combined? 

•  We  should  not  concern  ourselves  with  this. 
We  consider  it  non-issue  and  it  would  more 
appropriately  be  left  to  individual  re¬ 
searchers  to  apply  the  appropriate  academ¬ 
ic  tool  in  a  solution  to  the  problem  at 
hand . 


Question  No.  9.  Is  there  a  need  for  a  stan¬ 
dard  set  of  models? 

•  Yes. 

(1)  to  cover  the  life  cycle. 

(2)  for  different  environments. 

(3)  to  handle  factors  involved  in  the 
process,  e.g.,  resources,  reliability, 
growth.  There  would  be  all  kinds  of 
time  phases  in  order  to  take  care  of 
overlapped  activities,  for  example. 


Question  No.  10.  Do  we  need  a  new  set  of 
models,  or  are  there  already  models  that  adequate¬ 
ly  satisfy  fhe  need? 

•  The  answer  to  this  seems  to  be  that  there 
already  exist  adequate  models  flow  which 
to  build  upon,  but  we  need  to  have  a  cat¬ 
aloging  of  these  models  to  define  their 
capabilities,  the  underlying  assumptions 
and  the  validity  of  the  results  that  they 
will  yield.  We  need  an  extension  in  the 
terms  of  the  agreed  upon  definitions  and 
we  need  an  extension  of  these  models  to 
satisfy  the  criteria  in  response  to  ques¬ 
tion  No.  7,  that  is,  it  should  satisfy  the 
management  activities  matrix  —  the  plan¬ 
ning,  control,  and  evaluation,  and  should 
identify  the  functions,  the  time  and  the 
cost  of  each  of  those  activities. 


Question  No.  11.  What  are  the  components  of 
an  overall  software  engineering  methodology? 

•  there  should  be  a  statement  of  objectives 

•  how  to  do  it 

•  a  means  for  quantitative  evaluation  (What 
is  being  done  and  achieved) 

•  manageable 

•  executable  by  normal  people  in  a  normal 
environment 

•  a  definition  of  the  range  of  applicability 


Question  No.  12.  Where  should  software  tech¬ 
nology  be  going? 

•  In  terms  of  products,  which  would  include 
microprocessors  and  their  support  soft¬ 
ware,  the  technological  thrust  should  be 
to  provide  complete  functional  specifica¬ 
tions  and  to  be  able  to  tally  a  defined 
set  of  standard  interfaces. 

•  A  continuous  Life  Cycle  in  which  each 
activity  fully  supports  all  those  that 
follow. 

•  There  should  be  high  level  environmental 
objectives  and  parameters. 


Question  No.  13.  "Are  there  standardizable 
methodologies?"  supposes  that  there  is  a  need 
within  the  Army  and  within  the  Department  of 
Defense  for  standardization.  This  implies  a 
taxonomy.  There  would  be  three  dimensions  in¬ 
volved  in  the  answer.  The  three  dimensions  are: 

•  Environment 

•  Activity  (phases) 

•  What  is  being  addressed  (for  example, 
cost,  reliability,  performance). 


RECOMMENDATIONS: 


Question  No.  14.  The  effect  of  software 
engineering  requirements  and  environmental  factors 
are  fundamental  to  this.  The  answer  to  this  is 
adequately  covered  in  the  responses  to  the  earlier 
questions. 

Question  No.  15.  How  can  we  characterize  the 
methodology?  The  answer  to  this  is  again  the 
answer  to  13. 


Question  No.  16.  Is  there  a  way  to  measure 
the  effective  life  cycle. 

•  The  answer  appears  to  be  yes  and,  as  an 
example,  the  evolution  dynamics  of  Belady 
and  Lehman  is  an  approach  that  provides 
considerable  insight. 


Question  No.  17.  When  should  a  system  be 
redone?  We  need  both  static  and  dynamic  indicators. 

•  Study  is  needed  in  this  area  to  classify 
and  catalog  what  these  indicators  should 
be. 

•  Current  practices  in  industry  may  be  help¬ 
ful  in  this  classification  action. 


With  respect  ot  the  general  questions,  what 
are  the  priorities?  We  should  go  after  solving 
those  issues  relating  to  life  cycle  management  of 
software.  It  appears  that  the  number  one  priority 
is  to  establish  a  common  set  of  definitions.  All 
of  the  others  are  important  but  they  are  hindered 
by  a  lack  of  required  definitions.  All  these  pro¬ 
blems  that  have  been  identified  solvable  in  the 
next  five  years.  Most  of  the  ones  that  have  been 
identified  and  commented  upon  are  solvable  in  the 
next  fiye  years  if  the  effort,  direction  and  re¬ 
sources  are  focussed  on  their  solution. 


What  should  next  year's  questions  be  focussed 

on? 

•  Agreement  on  a  common  set  of  life  cycle 
components,  or  phases. 

•  Status  of  the  current  research  now  on/ 
going. 


•  Data  collection  and  definitions.  Recommend 
setting  up  a  standing  committee  for  soft¬ 
ware  life  cycle  management  data  collect  ion , 
involving  identification  and  definition. 
This  might  be  done  by  some  organization 
such  as  AIRMICS. 

•  Establish  a  catalog  of  methodologies  and 
models.  There  should  be  a  group  to  ident¬ 
ify,  to  improve  and  to  recommend  adoption 
of  appropriate  methodologies  and  models. 

LIST  OF  NEEDED  DEFINITIONS: 

1.  Productivity 

2.  Life  cycle  phases 

3.  Lines  of  code 

4.  Complexity 

5.  Software  maintenance  (modification,  enhance¬ 
ment,  debugging,  error  fixing) 

6.  Error 

7.  Reliability  (quality  metrics) 

8.  Man  month  (effort  measurement) 

9.  Software  (system) 

10.  Verification,  validation. 

CONCLUSION: 

Good  progress  has  been  made  in  this  software 
lifecycle  management  workshop.  It  is  felt  that 
the  life  cycle  management  workship  is  an  important 
forum  focussing  ideas  for  improvement  in  future 
action.  The  results  observed  in  this  workship 
appear  to  be  fruitful  and  encouraging  with  respect 
to  what  was  identified  and  pointed  out  in  the 
first  software  life  cycle  workshop  a  year  ago. 
Continuation  appears  to  be  clearly  indicated. 


TENTATIVE  DEFINITIONS 

1.  Large :  A  software  project  is  large  if  it 
involves  at  last  two  separately  managed 
groups . 

2.  Life  cycle:  The  life  cycle  of  a  software 
project  encompasses  all  the  activities  from 
first  formal  conception  until  final  abandon¬ 
ment.  When  we  refer  to  "life  cycle"  of  an 
activity/phase,  it  must  always  be  qualified 
to  some  extent,  (for  example,  we  should  refer 
to  the  development  cycle  protion  of  the  soft¬ 
ware  life  cycle) . 
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APPENDIX  TO  REPORT  GROUP  III 


(Submitted  by  Leon  G.  Stuck!) 


I .  PROGRAMMING  ENVIRONMENT 


Background 


•  Paradoxically,  the  software  community 
has  automated  everyone’s  work  except  their  own. 

•  Isolated  tools  and  techniques  have  been 
developed . 

•  Software  is  still  extremely  labor  inten 


sive. 

•  Productivity  improvements  in  software 
construction  has  not  kept  pace  with  hardware. 

•  Software  is  rapidly  becoming  the  limit¬ 
ing  factor  in  large  systems. 

•  An  automated  programming  environment 
with  an  integrated  set  of  tools  for  the  manage¬ 
ment,  control,  testing,  and  documentation  of 
each  stage  the  evolving  software  offers  a  means 
for  greatly  reducing  software  costs  and  improv¬ 
ing  software  quality. 

•  Tools  within  an  automated  programming 
environment  should  include: 

-  Source  language  level  interpreters,  for 
statement-at-a-t ime  execution  and  tracing 

-  Compilers,  for  both  program  debugging 
(e.g.,  extended  syntax  checking  and  user  feedback) 
and  optimizing 

-  Cross-compilers 

-  Text  editors,  CRT  terminals 

-  Configuration  management  aids 

-  Automated  verification  and  testing  aids 

-  Interactive  debugging  aids 

•  Functions  provided  by  a  programming 
environment  should  include: 

-  Mechanisms  for  controlling  and  document¬ 
ing  the  communication  process  between  users-ana- 
lysts-programmers-managers . 


-  A  central  repository,  with  supporting 
data  bases,  for  configuration  management  and  con¬ 
trol  of  all  documentation  and  evolving  program 
representations  (design  and  code) . 

-  Quality  assurance  mechanisms  for  check¬ 
ing  adherence  to  project  standards. 

-  Automated  error  collection  and  reporting 
in  support  of  both  quality  assurance  and  config¬ 
uration  management. 

-  A  respository  of  test  data  and  test  re¬ 
sults  traceable  to  user  acceptance/test  criteria. 

Current  State 

•  Exaggerated  claims  have  been  and  continue 
to  be  made  for  isolated  tools  and  techniques. 

•  Most  program  development  is  done  with 
severely  inadequate  tools. 

•  A  compiler  is  frequently  equated  with  a 
programming  environment.  (In  reality,  a  compiler 
constitutes  only  one  small,  albeit  important,  com¬ 
ponent  of  an  automated  programming  environment) . 

•  Much  manual  drudgery  still  prevails  in 
most  current  programming  environments. 

•  Errors  once  discovered  and  removed  may 
reappear  due  to  the  manual  processes  currently 
employed  in  building  today’s  systems. 

•  Management  visibility  into  the  progress 
of  software  development  is  woefully  inadequate. 

•  Experimental  use  Is  being  made  of  selec¬ 
tive  "proven"  tool  and  technique  concepts  not  yet 
widely  available  (e.g.,  static  and  dynamic  ana¬ 
lysis  aids) . 

Future  Trends 

•  Programming  environments  will  be  designed 


Question  No.  14.  The  effect  of  software 
engineering  requirements  and  environmental  factors 
are  fundamental  to  this.  The  answer  to  this  is 
adequately  covered  in  the  responses  to  the  earlier 
quest  ions . 


Quest  ion  No.  iS.  How  can  we  characterize  the 
methodology?  The  answer  to  this  is  again  the 
answer  to  13. 


Question  No.  lb.  Is  there  a  way  to  measure 
the  effective  life  cycle. 

•  Tin'  answer  appears  to  ho  yes  and,  as  an 
example,  the  evolution  dynamics  of  Belady 
and  Lehman  is  an  approach  that  provides 
considerable  insight. 


Question  No.  17.  When  should  a  system  he 
redone?  We  need  both  static  and  dynamic  indicators. 

•  Study  is  needed  in  this  area  to  classify 
and  catalog  what  these  indicators  should 
he. 

•  Current  practices  in  industry  may  he  help¬ 
ful  in  this  classification  action. 


With  respect  ot  the  general  questions,  what 
are  the  priorities?  We  should  go  after  solving 
those  issues  relating  to  life  cycle  management  of 
software.  It  appears  that  the  number  one  priority 
is  to  establish  a  common  set  of  definitions.  All 
of  the  others  are  important  hut  they  are  hindered 
by  a  lack  of  required  definitions.  All  these  pro¬ 
blems  that  have  been  identified  solvable  in  the 
next  five  years.  Most  of  the  ones  that  have  been 
identified  and  commented  upon  are  solvable  in  the 
next  five  years  if  the  effort,  direction  and  re¬ 
sources  arc  focussed  on  their  solution. 


What  should  next  year's  questions  be  focussed 

•  Agreement  on  a  common  set  of  life  cycle 
components,  or  phases. 

•  Status  of  the  current  research  now  on/ 
going. 

TENTATIVE  DEFINITIONS 


RECOMMENDATIONS : 


Data  collection  and  definitions.  Ri-comnu  nd 
setting  up  a  standing  committee  for  soft- 
wan*  life  cycle  management  data  collect  ion, 
involving  identification  and  definition. 
This  might  be  done  by  some  organization 
such  as  AIRMICS. 

Establish  a  catalog  of  methodologies  and 
models.  There  should  be  a  group  to  ident¬ 
ify,  to  improve  and  to  recommend  adoption 
of  appropriate  methodologies  and  models. 


OF  NEEDED  DEFINITIONS : 

Product i vi tv 
Life  cycle  phases 
Lines  of  code 
Complex! tv 

Software  maintenance  (mod i f icat ion ,  enhance¬ 
ment,  debugging,  error  fixing) 

Error 

Reliability  (quality  metrics) 

Man  month  (effort  measurement) 


Software  (system) 

Ver i f icat ion,  val idat ion. 


CONCLUSION: 

Good  progress  has  been  made  in  this  software 
lifecycle  management  workshop.  It  is  felt  that 
the  life  cycle  management  workshop  is  an  important 
forum  focussing  ideas  for  improvement  in  future 
action.  The  results  observed  in  this  workship 
appear  to  he  fruitful  and  encouraging  with  respect 
to  what  was  identified  and  pointed  out  in  the 
first  software  life  cycle  workshop  a  year  ago. 
Continuation  appears  to  he  clearly  indicated. 


Larj^e:  A  software  project  is  large  if  it 
involves  at  last  two  separately  managed 
groups . 

Life  cycle:  The  life  cycle  of  a  software 
project  encompasses  all  the  activities  from 
first  formal  conception  until  final  abandon¬ 
ment.  When  we  refer  to  "life  cycle"  of  an 
act ivi ty /phase,  it  must  always  be  qualified 
to  some  extent,  (for  example,  we  should  refer 
to  the  development  cycle  protion  of  the  soft¬ 
ware  1 ife  cycle) . 
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and  selectively  implemented. 

•  Hardware  manufacturers  will  provide 
machine/ language  dependent  environments. 

•  Techniques  will  developed  to  isolate  lan¬ 
guage  and  operating  system  dependencies  as  much 
as  possible  (e.g.,  attempts  will  be  standardize 
the  interfaces) . 

•  HOL  standardization  within  DOD  will  make 
it  possible,  for  the  first  time,  to  achieve  a  rich 
program  development  environment  accessible  to 
larger  numbers  of  people. 

•  The  "National  Software  Works"  concept  will 
provide  valuable  knowledge  on  the  success  and 
failure  of  many  of  these  concepts  in  a  distribut¬ 
ed  environment. 

•  Additional  textual  and  graphical  tech¬ 
niques  and  automated  tools  will  be  developed  for 
representing,  documenting,  and  controlling  the 
iterative  nature  of  early  phases  of  program  de¬ 
velopment  (i.e.,  requirements  and  design). 

•  An  integrated  framework  will  be  develop¬ 
ed  for  applying  numerous  analytical  techniques 
(e.g.,  consistency  and  completeness  of  testing, 
formal  proof  techniques  for  selected  system 
components,  static/dynamic /symbolic  analysis  of 
subsystems) . 

•  The  theory  of  testing  will  receive  more 
academic  attention  than  in  the  past. 

•  Improved  techniques  will  facilitate  thf 
certification  and  recertification  process  of 
future  systems. 

II.  LANGUAGES  AND  ARCHITECTURE 
Background 

•  Testing  has  historically  been  and  cont¬ 
inues  to  be  very  costly. 

•  The  concept  of  built-in  test  circuitry  in 
hardware  is  widely  acceptable  and  increasing  in 
application.  Similar  approaches  can  and  should 
be  applied  to  software. 

•  Top  down  elaboration  and  refinement  of 
acceptance/test  criteria  can  be  generated  in  par¬ 
allel  with  system  development. 


•  Acceptance/test  criteria  when  incremen¬ 
tally  developed  and  included  in  source  code  via 
special  comments  (or  test  assertions)  have  been 
shown  in  experiment  to  improve  software  quality 
and  reduce  testing  time. 

o  This  concept  can  be  used  in  conjunction 
with  both  dynamic  run-time  analysis  systems  and 
formal  verification  systems. 

Current  State 

•  Several  prototype  systems  have  been  built 
or  designed;  however,  none  are  currently  opera¬ 
tionally  used. 

Examples: 

-  Stucki's  experiments  with  a  PL/1  proto¬ 
type  system  at  UCLA. 

-  University  of  Texas  Gypsy  programming 

system. 

•  Current  language  work  on  DOD-1  has  pro¬ 
vided  "an  opening"  through  a  very  fuzzy  assertion 
concept. 

-  An  assertion  statement  has  been  pro¬ 
vided  in  the  language,  but  its  syntax  and  seman¬ 
tics  and  use  are  unspecified  at  this  time. 

Future  Trends 

•  Further  procedures  will  be  developed  for: 

-  specifying  acceptance/test  criteria 
during  the  system  requirements  phase,  and 

-  refining  the  acceptance/test  criteria 
throughout  the  subsequent  design  and  construction 
phases . 

•  Language  work  will  provide  new  and  more 
powerful  constructs  for  expressing  self-test  and 
monitoring  concepts. 

•  Automated  tools  employing  these  concepts 
will  be  able  to  greatly  improve  the  testing  and 
maintenance  processes. 

III.  FRAMEWORK  FOR  MODELLING  AND  SIMULATION 
Background 

•  Modelling  and  simulation  have  been  used 
widely  by  various  analytical  disciplines. 

•  The  models  of  the  various  disciplines 
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have  generally  been  incompatible. 

•  There  is  a  need  to  provide  a  framework  and 
data  base  mechanism  for  controlling  and  accumulat¬ 
ing  knowledge  of  a  given  system  gained  through 
various  modelling  and  simulation  activities. 

•  Executive  and  utility  functions  include: 

-  Model/Data  input  preparation  and  storage 

-  Assistance  in  the  creation  and  mainten¬ 
ance  of  interfaces  between  models 

-  Assistance  with  output  report  prepara- 

t  ion 

Current  State 

•  The  Air  Force  is  currently  studying  the 
requirements  for  at  least  one  such  system  (i.e.f 
General  Modelling  System  project  under  ICAM) . 

•  Other  industry  efforts  in  CAD/CAM  (com¬ 
puter  aided  design/computer  aided  manufacturing) 
are  exploring  this  area. 

•  Interfaces  to  hardware  are  increasing  as 
digital  computers  replace  analog  devices. 

Future  Trends 

•  Prototype  systems  will  be  built  and 
studied.  (Application  specific  systems  will  be 
available.) 

•  Systems  will  support  hierarchies  of  models 
as  well  as  interdisciplinary  interfaces. 

•  Increased  interaction  will  also  be  involv¬ 
ed  with  actual  sensor  systems. 
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Following  are  the  panel's  observations  and 
recommendations  on  the  use  of  metrics  to  improve 
the  understanding  and  management  of  the  software 
development  and  maintenance  process. 


We  see  it  encouraging  that,  compared  to  last 
year's  workshop,  the  present  papers  are  more 
evaluative  than  speculative.  This  trend  should 
continue,  moreover,  emphasis  should  be  on 
jollcetiruj  facts.  During  its  short  history, 
software  sciences  have  been  characterized  by  a 
large  number  of  ideas,  techniques  and  tools  pro¬ 
posed,  with  the  result  that  there  are  now  more 
ideas  available  than  necessary  or  possible  to 
apply.  Yet  there  is  a  definite  shortage  of 
tes'ed  ideas .  Institutions  and  universities 
have  been  developing  new  techniques  and  approach¬ 
es  to  improve  the  process.  At  the  same  time 
builders  of  large  systems,  and  those  who  are  in 
charge  of  maintaining  these  extremely  complex 
man-made  objects,  are  still  forced  to  use  anti¬ 
quated  methods.  Neither  party  is  at  fault:  the 
problem  i9  that  there  is  no  way  to  demonstrate 
whether  an  idea  is  viable  and  whether  it  will 
beneficially  influence  the  development  process. 
Thus  gathering  facts  about  the  software,  and  about 
the  process  developing  it,  is  the  most  important 
next  step,  without  which  there  is  no  hope  for 
successful  transfer  of  technology. 

But  merely  gathering  facts  is  not  enough. 
Facts  should  be  structured  and  appear  In  a  format 
that  permits  the  comparison  of  systems,  situa¬ 
tions,  processes.  We  are  convinced  that  there 
already  exist  proposed  and  quite  {'remising 
n etrics  which,  although  applied  so  far  only  to 
limited  samples,  showed  interesting  results. 
Consequently,  instead  of  inventing  additional 
metrics  and  thus  increase  unnecessarily  the 
variety  of  available  approaches,  we  must  broaden 
the  basis  on  which  existing  metrics  are  applied. 
Coming  to  mind  are  proposals  by  Halstead,  McCabe, 
the  Be lady-Lehman  measures  performed  on  large 
systems,  and  others  found  in  the  literature.  We 
should  concentrate  on  a  handful  of  the  most 
promising  approaches,  align  them  with  each  other 
and  standardize. 


We  may  gain  confidence  in  these  metrics  by 
examining  their  usefulness  in  three  roles.  One 
role  is  to  extract  generally  valid  laws  about 
the  behavior  of  large  systems,  large  projects  and 
that  of  programmers.  The  second  is  to  predict 
the  evolution  of  the  very  project  or  system  being 
measured.  The  third  factor  is  psychological, 
namely  the  feeding  back  of  the  observations,  and 
predictors  based  on  them,  to  the  people  who 
created  or  caused  the  process  to  happen.  For 
example  Elshoff  of  CM  Research  found  it  often 
useful  to  make  visible  the  otherwise  invisible 
object,  the  program  itself. 

In  fact,  we  are  talking  about  the  existing 
and  sufficiently  validated  metrics,  many  of  them 
already  applied,  successfully  and  independent ly 
of  hardware  characteristics,  to  monitor  the 
development  of  new  projects  and  the  maintenance 
of  older  systems.  Clearly,  if  wc  want  to  thor¬ 
oughly  and  carefully  evaluate  the  usefullness  of 
an  approach,  we  cannot  rely  exclusively  on  com¬ 
puter  scientists  and  software  engineers:  other 
experts  must  also  participate.  Psychol og i st s 
for  example  are  trained  to  evaluate  complex  situ¬ 
ations  using  the  rigorous  and  well  established 
methods  of  experimental  design  and  statistics. 
What  we,  therefore,  propose  is  a  mul t i-di sic i- 
plinary  approach  toward  the  evaluation  of  the 
already  existing  but  sparsely  used  ideas  in 
order  to  weed  out  bad  approaches,  and  to  gradual¬ 
ly  improve  and  refine  the  ones,  whose  potential 
use  for  any  or  all  of  the  above  mentioned  three 
roles  is  immediate.  It  is  also  important  that 
we  restrict  measurements  to  a  handful  of  observ¬ 
ables,  such  as  Halstead's  operators  and  operands, 
and  then  deduce  the  other  attributes  such  as 
portability,  modifiability,  maintainability  from 
primitive  metrics. 

This  leads  us  to  the  most  difficult  attri¬ 
bute  of  the  process  yroduef ivi i y .  While  we 
believe  in  its  importance,  we  cannot  accept  its 
current  unit  of  measure,  namely  lines  of  code 
per  unit  time  or  man-month.  The  reason  for  tills 
disbelief  is  twofold.  First,  most  programmers 
will  be  just  modifying  and  maintaining  already 
existing  programs.  Secondly,  in  the  future  more 
and  more  new  programs  will  be  constructed  out  of 
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off-the-shelf  components.  Whether  in  modification 
and  maintenance,  or  svstem  composition  from  larger 
components,  the  line  of  code  measure  of  producti¬ 
vity  immediately  fails. 

We  measure  productivity  for  perhaps  two  rea¬ 
sons.  One  is  to  monitor  costs  and  the  second  is 
to  predict  the  resources  necessarv  for  the  devel¬ 
opment  of  new  products.  But  the  lines  of  code 
generated  Is  just  one  of  the  many  components  of 
the  total  cost.  Clearly,  the  quality  of  develop¬ 
ment  influences  the  cost  of  maintenance  and  modi¬ 
fication,  to  be  performed  over  a  long  period  of 
time.  Thus,  if  we  want  to  measure  productivity 
at  ail,  then  it  has  to  be  combined  tilth  a  metric 
capturing  quality*  Only  then  can  we  have  a  solid 
measure  for  prediction,  as  well  as  for  comparison 
of  different  systems  and  projects. 

As  already  indicated  before,  the  major  ob¬ 
stacle  to  progress  and  improvement  is  the  d  ffi- 
culty  to  transfer  technology,  i.e.,  ideas  ii to 
real-life  production  situations.  Take,  for  ex¬ 
ample,  a  methodology,  which  in  a  small  environ¬ 
ment,  and  mostly  by  the  inventors  of  the  method, 
is  believed  to  significantly  better  than  the 
currently  used  methods.  First,  the  new  idea  must 
fall  on  fertile  ground.  This  means  that  not 
only  do  the  receivers  of  the  idea  have  first  to 
be  educated  on  the  novelty  being  proposed  but 
their  mind  has  to  be  open  and  well  informed  about 
the  large  variety  of  other  alternatives.  Only 
then  can  a  dialogue  develop  and  factual  evalua¬ 
tion  take  place  before  commitments  to  the  new 
method.  Second,  the  dominant  factor  in  success¬ 
fully  transferring  technology  is  that  a  new  pro¬ 
posal  must  demonstrate  its  viability  by  facts. 
Otherwise  there  is  no  change  to  transfer  method¬ 
ology. 

Project  management  is  right  in  refusing 
untested  methods,  untested  ideas  and  techniques. 
Proposals  must  be  demonstrated  to  be  beneficial 
to  the  project.  The  major  problem  now  is  that  we 
do  not  have  any  place,  any  forun  or  any  organiza¬ 
tional  entity  tiherc  at  a  reasonable  scale,  methods 
can  be  tested  and  their  viability  demonstrated. 
Thus  the  creators  of  ideas  remain  frustrated. 

They  never  see  the  utility  of  the  ideas  on  which 
they  work  so  hard.  Yet  they  should  know  which 
Ideas  make  sense  and  which  not.  It  is  better  to 
know  that  an  idea  does  not  make  sense  in  real 
life  than  to  remain  with  the  uncertainty  about 
its  value  and  then  blame  the  developers  for  not 
implying  something  which  is  supported  only  by 
speculat ion. 

This  leads  us  to  the  problem  of  where  to 
generate  facts  about  new  proposed  methods,  tech¬ 
niques,  tools  and  other  novelty.  Why  does  the 
Army  not  set  aside  resources  for  the  sole  purpose 
of  validating  the  ideas  created  inside  and  out¬ 
side  Its  own  organization?  Unusable  novelty 


would  rapidly  disappear  while  usable  ones  find 
their  way  into  development,  thus  improving  the 
overall  quality  of  the  software  life  cycle.  But 
even  without  such  an  ideal,  separate  organization, 
systematic  gathering  of  facts  on  real  projects 
would  still  tremendously  improve  the  learning 
process  and  encourage  the  flow  of  information 
about  techniques  and  tools  because  measuring  with 
agree*1  upon  and  aligned  metrics  facilitates 
comparison  of  methods  applied,  projects  managed 
and  software  produced.  Again,  emphasis  must  be 
on  facts,  and  on  measured  and  comparable  facts. 

At  the  workshop  it  became  also  clear  that 
our  universe  of  discourse  is  not  homogeneous  any 
more.  There  is  no  such  thing  as  a  typical  pro¬ 
gram,  typical  project,  or  typical  situ.it  ion. 

What  is,  indeed,  badly  needed  is  the  taxonomy, 
the  olassi  float  -ion  of  may  aspects  of  the  life 
cycle.  Before  we  even  start  measuring,  we  must 
know  precisely  what  we  measure  and  what  we  com¬ 
pare  against.  We  must  set  up  the  metrics  and 
the  measurements  according  to  whether  they  are 
about  a  small,  medium  or  large  size  project, 
whether  we  measure  an  on-line,  interactive  or 
batch  system  or  its  development,  and  so  forth. 

The  other  important  aspect  of  taxonomy  is 
that  we  have  to  recognize  the  limits  of  validity 
of  all  the  metrics  and  models  which  we  apply. 
Similar  to  physics,  life  cycle  management  must 
also  have  scaling  or  model  rule  effects.  F.very 
metric,  every  measurement  method  has  its  domain 
of  validity.  Beyond  this  domain  of  validity,  one 
may  have  to  live  with  false  results  or  else  must 
adjust  the  metric  with  some  suitable  parameters. 
Moreover,  classification  in  the  software  develop¬ 
ment  and  maintenance  must  be  along  many  dimen¬ 
sions:  development,  specification  •  nd  standard¬ 
ization  of  error  classes,  categories  oi  programs, 
processes  and  so  forth.  Briefly:  taxonomy  is 
one  of  the  most  important  prerequisites  to  good 
measurement  and  then  to  good  and  valid  comparison 
of  the  measured  objects. 

A  specifically  Important  case  for  precise 
definitions  and  standardization  is  the  case  of 
milestones.  In  general,  it  is  desirable  to  sub¬ 
divide  a  project  in  two  dimensions.  Time  is 
one  of  the  dimensions:  one  would  like  to  see 
the  precise  transition  point  from  one  phase  of 
the  life  cycle  to  another;  for  example  the  point 
defined  by  the  end  of  design  and  the  beginning 
of  implementation.  The  other  dimension  is  the 
product  itself,  namely  its  decomposition  into 
major  components.  It  is  also  necessarv  to  de¬ 
compose  the  cost  estimates  and  then  the  actual 
cost  Items  along  both  dimensions.  Thus  we  must 
find  ways  to  precisely  specify  and  mutually 
agree  upon  this  two  dimens ional  grid  which  is 
applied  over  the  total  project  in  the  time  and 
the  product  domains.  Without  agreed  upon 
definitions,  such  as  the  events  wliich  specify 


the  transftion  from  one  phase  of  the  life  cycle 
to  another,  milestones  have  absolutely  no 
meaning  and  their  use  probably  causes  more 
confusion  than  allow  for  sound  monitoring  and 
comparison . 

A  few  words  about  tools.  We  mean  here  tech¬ 
niques  and  instruments  which  are  necessary  to 
extract  the  facts  and  then  form  metrics:  the 
tools  of  data  collection.  Interestingly  enough 
we  all  agree  that  there  already  exist  built-in 
methods,  and  techniques,  which  continuously 
collect  data  which  in  turn  are  never  interpreted 
or  used.  In  fact  we  believe  that  one  could  start 
immediately  at  practically  2ero  cost  to  gather 
data  without  the  additional  building  of  new 
tools.  Compilers  are  an  example.  During  compi- 
iat  ion  large  amounts  of  significant  data  are 
collected,  but  after  having  produced  the  code, 
the  contents  of  internal  tables  become  discarded. 
Intelligent  use  of  already  generated  data  as  a 
basis  for  metrics  and  meaningful  comparisons 
within  and  between  the  different  systems  and 
projects  would  be  an  almost  zero  cost  activity. 

We  invite  the  Army  to  first  look  around  for 
already  existing  tools  and  data  already  being 
gathered  before  a  vast  and  expensive  tooling  up 
of  projects  starts. 

We  consider  the  order  of  importance  of 
things  to  be  done  as  implied  in  the  order  above. 
Almost  all  proposals  are  doable  within  the  next 
five  years.  An  exception  is  perhaps  the  error 
taxonomy  which  should  be  a  research  effort.  We 
see  also  quite  dark  with  respect  to  a  good  com¬ 
prehensive  and  sensible  productivity  measure. 

The  Workshop  Chairman  wanted  some  questions  for 
next  year's  workshop.  Well,  Barry  Boehm  pro¬ 
posed,  the  best  such  question:  "What  happened 
to  the  recommendations  of  the  previous  year?" 

We  hope  for  the  best. 

We  do  not  propose  here  specific  research 
activities  now.  Rather,  we  call  attention  again 
to  the  multi-disciplinary  approach.  The  •'ole 
or  psychological  research,  particularly  its  role 
in  evrluating  proposed  methods  and  techniques, 
should  be  sign! f Xcantly  increased.  Experts  from 
sociology  and  management  sciences  should  also 
play  an  increasing  role  in  life  cycle  related 
research.  Also,  the  use  of  time  series  analysis 
must  be  introduced.  But  in  any  case,  future 
research  should  be  based  on  actual  data,  more¬ 
over,  or.  aligned  data.  As  long  as  scattered 
research  grcups  all  define  and  Interpret  their 
own  data,  or  use  other  people's  unaligned  data, 
we  cannot  expect  that  transfer  of  knowledge  from 
one  place  to  another  be  possible.  In  fart,  we 
propose  that  a  central  data  ban*'  and  clearing 
house  for  data  be  established  within  boD  In 


order  to  provide  badly  needed  factual  information 
as  a  basis  for  coordinated  research.  We  under¬ 
stand  that  the  Rome  Air  Development  Center  will 
soon  be  ready  to  play  this  extremely  important 
role.  Also  along  central i zat ion,  we  propose  the 
rigorous  definition  of  the  following  ten  most 
important  terms:  the  six  or  seven  phases  of  the 
life  cycle:  requirement,  specification,  design, 
etc:  complexity,  quality,  productivity  (and 
probably  all  the  "abilities"  which  are  so  heavily 
used,  yet  never  defined).  Some  efforts  already 
exist  within  the  Air  Force,  the  Army  and  CITE 
Corp.  Without  such  definitions  technical  people 
in  large  organizations  are  forced  to  use  local 
definitions  or  take  as  source  the  trade  magazines 
and  professional  literature. 

In  summary,  we  believe  that  a  i  Iras  tic  an<l 
mas&ii'e  shift,  to  fact  fi >uJi n<j  a>ui  to  oy\} aKizinj 
the  know! ede  oe  already  hai\‘3  must  characterize 
the  next  years  to  come. 
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Abstract 

An  appraisal  of  past  experiences  relevant  to 
achieving  awareness  of  the  cost  of  software  is 
provided  in  terms  of  personal  recollections  about 
the  "good  old  days".  The  difference  between  what 
we  plan  to  do  and  what  we  really  do  in  developing 
software  is  discussed  and  identified  as  a  signi¬ 
ficant  source  of  the  cost  problem.  A  striking 
similarity  between  the  properties  of  computer  pro¬ 
grams  and  the  characteristics  of  the  software  de¬ 
velopment  process  is  suggested.  Application  of 
computer  program  analysis  tools  to  support  de¬ 
tailed  evaluation  of  the  development  process  is 
proposed.  Some  potential  benefits  regarding  im¬ 
proved  understanding  of  software  production  costs 
are  discussed  and  related  to  possible  modification 
of  current  software  procurement  practices. 


Introduction 

A  few  years  ago  a  good  friend  of  mine  wrote  a 
very  interesting  letter  entitled,  "Random  Thoughts 
on  Software  Integrity,  or.  Nostalgia  for  the  Good 
Old  Days".  Among  other  things,  the  letter  served 
to  refresh  my  recollection  of  the  good  old  days 
while  stimulating  something  akin  to  at  least  a 
rough  comparison  of  then  and  now. 

In  order  to  get  my  thoughts  straight  I  found 
I  had  to  determine  approximately  when  "then"  stop¬ 
ped  and  "now"  started.  Having  worked  continuously 
in  one  way  or  another  with  the  production  of  com¬ 
puter  programs  since  sometime  in  1960,  It  is  per¬ 
haps  meaningful  to  Identify  some  point  in  time 
which  separates  the  good  old  days  from  whatever 
one  might  call  more  recent  times.  For  me  such  a 
delineation  comes  at  approximately  the  midway 
point,  that  is  in  late  1967  or  early  1968.  As  I 
pursued  this  train  of  thought,  I  became  aware  (or 
finally  admitted  to  an  awareness)  of  certain  facts. 
1  believe  that  my  findings  are  relevant  to  any 
discussion  on  the  claimed  cost  and  difficulty  of 
developing  software,  and  I  hope  to  demonstrate 
relevance  in  the  following  paragraphs. 

In  Retrospect 

At  some  point  early  in  my  rambling  thought 
process  on  the  subject,  I  found  myself  hard  put  to 
answer  the  question,  "When  did  things  start  to  go 
wrong?".  After  considerable  soul  searching,  I  was 
able  to  settle  upon  a  fairly  specific  time  period 


during  which  I  had  begun  to  understand  that  there 
were  certain  problems  associated  with  software 
development.  It  is  especially  important  to  note 
that  I  have  not  intimated  that  software  develop¬ 
ment  was  problem  free  prior  to  1968  and  plagued 
with  problems  thereafter.  1  have  simply  concluded 
that  in  late  1967  and  early  1968  something 
happened  which  prompted  (in  me)  a  rather  keen  (but 
previously  non-existent)  awareness  of  some  soft¬ 
ware  development  problems.  I  found  it  instructive 
for  my  purposes  to  attempt  to  identify  whatever  it 
was  that  was  special  or  different  about  that  point 
in  time  and  have  related  memorable  characteristics 
below  in  no  particular  order. 

•  I  had  recently  been  given  my  first  real 
fiscal  responsibility  within  a  major 
(large  scale)  software  development  activ¬ 
ity. 

•  I  was,  prior  to  that  time,  only  remotely 
aware  of  the  fact  that  software  is  devel¬ 
oped  for  a  customer.  At  this  point  I  be¬ 
gan  to  be  exposed  to  the  need0,  hopes, 
fears,  and  frustrations  of  a  customer  or 
a  regular  (almost  daily)  basis. 

•  I  was  asked  for  the  first  time  to  deliver 
a  large  program  to  a  customer.  In  parti¬ 
cular,  it  was  a  program  which  contained  a 
number  of  large,  integral  elements  abort 
which  I  personally  knew  little  nr  nothing. 

•  I  become  aware  of  the  existence  of  the 
incomplete  requirement  specification. 

More  importantly,  I  became  convinced  that 
it  nlayed  a  critical  part  in  a  supposedly 
"formal"  software  development  process. 

•  I  was  asked  (albeit  very  indirectly)  id 
not  in  so  many  words  to  compromise  "in¬ 
tangible"  quality  in  favor  of  tangible, 
timely  (on  schedule)  delivery.  I  was  sub¬ 
sequently  required  to  "explain  away'  pro¬ 
blems  or  relate  them  to  known  errors  and 
lack  of  specificity  in  the  requirement 
specification. 

I  am  not  at  all  sure  which  of  the  above  was 
most  instrumental  in  changing  my  view  of  what  -  off- 
ware  development  is  all  about.  Perhaps  more  ii 
portant  than  any  of  the  individual  items  was  u . . i - 
mately  the  frame  of  mind  which  came  from  living 
through  arid  coping  with  these  new  (for  me)  experi¬ 
ences.  The  most  striking  characteristics  of  my 
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new  frame  of  mind  was  a  strong  realization  that 
software  development  costs  a  lot  of  money. 

Coupled  with  this  was  the  growing  feeling  that 
there  were  some  customers  who  were  hard  to  con¬ 
vince  that  they  had  gotten  their  money's  worth, 
especially  if  the  software  did  not  work  exactly 
as  expected  or  better.  Unfortunately,  speci¬ 
fying  exactly  what  is  expected  of  software  has 
proven  to  be  at  least  as  difficult,  if  not  more- 
so,  than  specifying  things  in  general  [1,2,3]. 

In  fact,  a  great  deal  of  the  thinking  that  has 
been  given  to  the  cost  and  quality  of  software 
has  concerned  the  difficulties  inherent  in 
specifying  intent  ( i . e . ,  requirements),  test¬ 
ing  and  demonstrating  satisfaction  of  those 
requirements,  and  providing  for  full  and  time¬ 
ly  communication  between  software  developers  and 
users  [4, 6,6,7, 8]. 

About  Software  Life  Cycle  Cost 

So  far,  I  have  simply  related  some  of  the 
details  and  subsequent  conclusions  derivable 
from  a  conversation  with  myself  about  the  good 
old  days.  It  is  perhaps  apparent  that  I  have 
more  or  less  dispelled  the  notion  that  there 
really  were  any  good  old  days,  but  rather  that 
there  was  a  time  when,  for  a  number  of  reasons, 
programming  was  fun  and  my  worries  were  few. 

As  is  often  the  case  with  my  rambling 
thoughts  about  software  development,  however, 

I  eventually  found  myself  taking  a  hard  look  at 
the  development  process  from  a  different  but 
potentially  useful  point  of  view. 

Most,  perhaps  all,  people  who  claim  to  know 
how  software  gets  developed  roughly  view  the  dev¬ 
elopment  process  as  a  serial  one  which  includes 
the  following  (or  equivalent)  events: 

1)  Concept  (Requirements)  Definition 

2)  Detailed  Requirements  Specification 

3)  Preliminary  Design 

4)  Detailed  Design 

5)  Code  and  Debug 

6)  Checkout 

7)  Test  Planning 

8)  Test  Execution 

9)  Test  Evaluation 

10)  Acceptance  and  Use 

11)  Maintenance  (modification)  and  Re-Test 
(as  required) 

There  are  many  variations  which  are  more  or 
less  equivalent  to  the  above  and  apparently  many 
different  Impressions  about  the  proper  ordering 
of  the  events  [9,10,11,12,13].  For  purposes  of 
this  discussion  we  can  assume  that  some  major 
software  development  activities  proceed  through 
phases  leading  to  the  above  sequence  of  events 
with  the  usual  Iterative  occurrence  of  events  4 
through  9. 


It  appears  that  we  can  be  comfortable  with 
(and  readily  accept)  a  not-qui te-accurate  picture 
of  the  development  process  to  which  we  attribute 
serial  orderliness  and  implicit  continuity.  Fur¬ 
ther,  and  more  important,  we  are  hard  pr(  ;sed  to 
learn  very  much  about  the  real  costs  of  software 
development  until  we  can  shed  the  notion  of  a 
fixed  sequence  of  events  and  come  to  grips  with 
the  complex  and  highly  dynamic  interaction  of 
these  events  which  is  characteristic  of  much  soft¬ 
ware  development  activity.  Fcr  instance,  if  we 
were  to  draw  a  flow  diagram  of  the  development 
process  (both  "before"  and  "after"),  we  could  well 
see  the  kind  of  contrast  illustrated  in  Figures  1 
and  2. 


Figure  1.  A  "Before"  View  of  Software  Production 
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Production 


Towards  Understanding  Production  Cost 

It  is  probably  not  through  sheer  coincidence 
alone  that  we  use  the  word  "program"  for  very 
large  software  development  projects.  If  put  to 
the  test,  we  can  find  many  similarities  between 
the  properties  of  the  end  item  (i.e.,  the  compu¬ 
ter  program)  and  the  "program"  or  process  through 
which  it  was  produced.  We  have  labored  long  and 
hard  to  develop  techniques  and  tools  [14,16,16, 17, 
18,19,20]  which  provide  valuable  insight 
into  the  intricate  and  highly  complex  interactions 
of  the  components  of  computer  programs.  We  have 
rigorously  applied  such  tools  in  the  study  of  com¬ 
puter  program  efficiency  and  have  saved  countless 
thousands  of  seconds  of  computer  time.  Unfortuna¬ 
tely,  we  have  not  done  nearly  so  well  in  shaving 
seconds  off  the  time  required  for  development  of 
complex  software  systems.  Perhaps  it  is  time 
that  we  seriously  considered  application  of  such 
tools  in  investigation  of  the  other  kind  cf  pro¬ 
grams.  Maybe  then  we  could  begin  to  develop  a 
better  understanding  of  how  software  really  gets 
put  together  and  where,  in  the  process,  the  high 
cost  of  software  truly  lies  and  where  efforts  to 
reduce  excessive  cost  can  and  should  be  concentrat¬ 
ed  [21,22] 


A  Proposal 


I  am  particularly  intrigued  with  the  poten¬ 
tial  application  of  program  path  analysis  and 
usage  monitoring  tools  [23,24,25]  to  help  achieve 
a  more  thorough  understanding  of  what  really  takes 
place  deep  within  the  development  process.  I 
suggest,  for  example,  that  something  along  the 
following  lines  could  be  done.  First,  we  devise  a 
project  (program)  plan  which  possesses  all  the 
conditional  branching  (go  back  and  redo,  etc.) 
characteristics  which  are  illustrated  in  the  pre¬ 
viously  defined  schematic  labelled  ArTER.  Then, 
as  we  think  of  the  programs  tic  events  much  like 
the  segments  or  units  of  code  within  a  program, 
we  may  define  a  convenient  notation  and  procedure 
for  developing  path-like  sequences  which  alone  or 
in  combination  represent  potential  modes  of  tra¬ 
versing  the  network  [25,26,27].  Now  if  we  also 
took  steps  to  assign  time-to-complete  (UC)  and 
cost-to-complete  (CTC)  values  to  each  of  the  ele¬ 
ven  previously  Identified  events,  it  is  possible 
to  estimate  significant  project  cost  and  schedule 
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variations  as  a  function  of  our  "best  guesses" 
at  the  way  project  work  will  actually  proceed. 

We  must  be  careful  however  to  consider  the 
following: 

1)  TTC  and  CTC  values  are  probably  varia¬ 
ble  and  are  functions  of  many  factors 
including  1)  state-of-the-art  of  the 
specific  technological  area,  2)  level 
of  experience  and  expertise  of  availa¬ 
ble  personnel,  3)  event  count  (i.e., 
it  may  or  may  not  be  cheaper  to  write 
detailed  requirements  for  the  third 
time  than  for  the  second  time),  and 

4)  event  predecessor  relationships 
(i.e.,  it  may  cost  more  (or  less)  to 
revise  or  redevelop  a  preliminary  de¬ 
sign  after  extensive  testing  has  been 
completed  than  before). 

2)  Where  appropriate  perhaps  in 
accordance  with  existing  procurement 
regulations  or  customer-contractor 
agreements)  those  branching  character¬ 
istics  which  are  to  be  disallowed  must 
be  taken  into  consideration.  For  exam¬ 
ple,  there  is  at  least  a  common  sense 
requirement  that  test  evaluation  is  not 
followed  by  preliminary  design  more 
than  n  times.. 

3)  It  may  be  important  to  avoid  the  prema¬ 
ture  conclusion  that  the  path  1 -2-3-4- 
5-6-7-8-9-10-11  obviously  presents  the 
most  appealing  cost  picture,  because  it 
may  well  be  the  most  unrealistic  path 

in  terms  of  potential  development  activ¬ 
ity. 

Finally,  we  might  look  briefly  at  one  possi¬ 
ble  derivative  of  the  approach  briefly  outlined 
above.  For  example,  consider  the  possibility  of 
a  customer  somewhere  who: 

1)  has  a  problem  to  solve  which  requires 
procured  services  for  the  production  of  a 
software  system,  and 

2)  is  willing  and  able  to  establish  con¬ 
straints  of  the  type  mentioned  above  in  2. 

We  can  then  conceive  of  competitive  contrac¬ 
tors  who  submit  proposals  consisting  of  the  usual 
technical  volume  (i.e.,  background  information, 
statement  of  work,  technical  approach,  related 
experience,  facilities,  etc.).  Consider,  however, 
a  very  different  kind  of  management  and  cost  vol¬ 
ume  which  presents  a  predetermined  number  of  cost 
and  schedule  proposals  corresponding  to  at  least 
the  high  likelihood  paths  through  the  network. 

The  availability  of  tools  to  support  this  kind  of 
effort  is  assumed,  since  I  personal  ly  know  of  a 
large  number  of  computer  program  analysis  tools 
which  would  require  only  minor  modification  to 
provide  network  analysis,  path  generation  and  dis¬ 
play  capabilities.  These  tools  are  described  at 
length  in  the  literature  primarily  addressing 
subjects  of  automating  software  testing,  monitor¬ 
ing  and  measuring  the  thoroughness  of  testing, 
and  static  analysis  of  program  structure  [6,7,10, 


14,18,19,23,24,25,26,27].  One  might  expect  then 
that  we  could  readily  acquire  a  capability  to  mon¬ 
itor  project-program  operation  and  subsequently 
compare  expected-versus-actual  operational  experi¬ 
ence  with  life  cycle  costs. 

Perhaps  then  if  we: 

1)  develop  and  experiment  with  an  approach 

to  procurement,  program  planning,  cost  es¬ 
timation  and  performance  assessment  some¬ 
thing  like  that  described  here, 

2)  use  tools  to  maximum  advantage  to  support 
more  objective  yet  precise  consideration 
of  pertinent  cost  factors,  and 

3)  try  to  be  as  honest  as  possible  with  each 
other  about  what  all  this  means, 

then  I  expect  we  will  begin  to  gain  the  kind 
of  insight  needed  in  order  to  get  a  handle  on  and 
do  something  about  the  real  problems  relevant  to 
the  high  cost  of  software  development  [28,29,30]. 

It  is  possible  that,  with  practice,  using  the  cost 
modeling  technique,  we  could  get  pretty  good  at 
estimating  at  least  upper  and  lower  bounds  on  pro¬ 
ject  cost.  It's  possible  also  that  in  our  quest 
for  truly  improved  software  engineering  practice 
[29]  the  modeling  technique  can  help  us  give  more 
careful  attention  to  the  real  cost  payoffs  from 
modern  techniques  and  tools  and  further  promote 
their  judicious  and  cost  effective  application  to 
future  development  activities.  We  might  even  be 
able  to  stop  talking  about  the  good  old  days  as  if 
they  were  some  time  in  the  past. 
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ABSTRACT 

Some  of  the  problems  of  major  system  develop¬ 
ments  can  be  traced  to  the  lack  of  a  rigorously 
linked  chain  of  documents  connecting  the  operational 
needs  and  context  to  the  design  given  to  the  imple- 
menters.  A  method  of  constructing  such  a  rigorous 
linkage  using  SADT©  models  is  outlined  in  this 
paper. 


P_R0B_L_E_M 

Various  problems  have  plagued  the  development 
of  'rrge  military  systems.  Among  these  problems 
is  that  the  delivered  system  is  not  what  the  user 
expected  and  is,  in  fact,  not  very  useful,  even 
though,  in  some  sense,  it  works.  Another  problem 
is  that  requirements  seem  to  chanqe  rapidly  during 
development,  either  increasing  cost  and  slipping 
the  schedule  or  decreasing  the  usability  of  the 
product.  The  Department  of  Defense  has  been  trying 
to  address  these  problems  with  new  regulations 
(CA-109,  5000.1,  5000.2,  5000.3,  etc.);  these  regula¬ 
tions  will  improve  things.  Some  important  holes 
in  the  management  process  remain,  however. 

The  approach  to  technical  program  management 
taken  in  this  paper  assumes  some  of  the  troubles  in 
major  system  acquisitions  are  due  to  the  following 
causes: 

1.  Something  gets  lost  in  the  t  snslation  from 
stage  to  stage  in  the  development  process.  "Noise" 
accumulates  until  the  product  differs  appreciably 
from  the  user's  original  version  of  it.  Current 
military  documentation  regulations  do  not  prevent 
this,  because  they  specify  form  rather  than  sub¬ 
stance. 

2.  Many  of  what  are  called  requirements  changes 
are  in  fact  only  requirements  document  changes 
made  necessary  by  the  continual  uncovering  of  old, 
unchanged  requirements  not  in  the  document  because 
the  requirements  definition  process  was  inadequate. 

m  i2] 


APPROACH 

The  two  causes  of  the  problem  are  addressed 
by  producing  a  chain  of  documents  that  link  the 
system  user's  conception  of  the  system  to  an 
abstract,  implementation-free  specification  of  the 
system  to  be  given  to  the  implementer.  The  docu¬ 
ments  constrain  the  implementer  to  produce  just 
what  the  user  needs,  but  leave  him  free  to  trade 
off  the  alternative  ways  of  realizing  the  system. 

For  this  approach  to  succeed,  the  following 
must  be  met: 

1.  The  documents  must  be  formally,  rigorously 
1  inked  one  to  the  next. 

2.  For  each  document  in  the  chain,  one  must  be 
able  to  decide  what  information  is  or  is  not 
supposed  to  be  in  it. 

3.  These  documents  must  relate  to  the  things 
controlled  in  the  major  system  acquisition  process 
that  the  DoD  uses. 

The  documents  proposed  here  are  a  linked  set 
of  SADT  models. | 3)  Before  these  documents  are 
described,  it  is  necessary  to  provide  a  semiformal 
definition  of  the  terms  used  in  the  rest  of  the 
paper. 


nE_FINITIO_N_F_RAM£WOR_K 

A  r;..!-.  is  a  set  of  interacting  "phenomena" 
°f  "actual  reality."  (Quotes  indicate  primitive 
terms. ) 

-r i >.r  are  predicates  about  "phenomena." 
are  propositions  about  "phenomena." 

A  ;  is  a  relation  between  a  set  of  ;.<.>•:■■ 
and  a  set  of  .imiuiw. 
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A  mode!  of  a  system  is  a  relation  between  a  set 
of  questions  about  "phenomena"  of  the  model  and 
answers  about  "phenomena"  of  the  system  to  some 
tolerance. 

The  piti-pese  of  a  »...•./,•/  is  its  set  of  ;.■<•  'em’. 

In  the  definitions  that  follow,  by  model  it  is 
meant  a  model  produced  using  SADT.  [3J 

The  ci.-v  of  a  ••..■del  is  the  choice  of  parti¬ 
tioning  at  the  first  level  of  the  hierarchy.  The 
vie:.'  is  constrained  to  be  consistent  with  the 


The  vantage  point  of  a  mu.lt'.’  is  a  set  of  con¬ 
straints  placed  on  the  view  so  that  it  embodies 
the  abstractions  and  perceptions  of  a  certain  kind 
of  person.  (Not  only  must  the  view  satisfy  the 
•puri’ose ,  it  must  be  formed  of  pieces  a  person 
recognizes. ) 

A  :«»,'>•  is  a  vantage  point  of  a  person  who 
interacts  with  the  o;. s':.--  as  it  exists  in  "actual 
real ity. " 

A  functional  model  is  a  model  of  a  system 

from  the  view  of  a  user  of  that  system. 

An  activ.it  ion  rule  (of  an  SADT  diagram)  is  a 
logical  expression  stating  the  relation  among 
input,  control,  and  output  arrows. [4]  The  activa¬ 
tion  rule  states  the  conditions  (presence  or 
absence  of  data)  on  the  input,  control,  and  output 
arrows  such  that  presence  of  data  on  the  output 
arrows  will  occur.  Activation  rules  transform 
SADT  diagrams  into  finite-state  machine  descriptions. 

An  act  f  vat  in  is  a  path  traced  through  a  mode: 
from  its  external  input  and  control  arrows  to  its 
external  output  arrows,  according  to  the  activation 
rules  controlling  each  diagram  of  the  nod.-l. 

A  soen.tr ic  is  a  set  of  values  applied  to  the 
external  arrows  of  a  mode!  before  tracing  an 

activation  of  the  rode!. 

A  concept  of  operation  of  a  system  is  a  mode! 
containing  a  function. ti  rode!  of  that  system  along 
with  fund  ion.it  models  of  systems  it  interacts 
with.  A  oonory t  of  operations  specifies  the  set  of 
all  valid  activations  of  a  system. 

An  i 'sr.trq.de  of  a  concept  of  .  ;  .  rations  is  an 
activation  of  a  conoop!  of  opora t ions  for  some 
seen.  trio. 

A  lesianer  is  a  vantage  joint  of  a  person  who 
choses  at  each  hierarchical  level  of  detail  of  a 
model  a  partitioning  that  results  in  the  least 
coupled  pieces. 

A  design  model  of  a  system  is  a  funotion.il 

mode!  of  that  system  repartitioned  from  the  vuu 

of  a  designer.  The  functional  model  maps  into,  not 


onto  the  design  node!,  since  additional  information 
about  the  system  may  be  incorporated  in  the 

nod.  !. 

A  s: .  e!  float  !c>.  is  a  set  of  act  hut:.  n 

applied  to  the  i.sijn  n.d .  !  to  constrain  its  set 
of  possible  act  iva:  i.  ns  to  be  in  the  bounds  descri¬ 
bed  in  the  function.:!  m,  del.  Some  of  these  con¬ 
straints  are  necessary  to  control  the  effects  of 
the  additional  information  the  i.sign  m  :■ 
contains. 

THE  CHAIN  Of  DOCUMENTS 


The  chain  consists  of  the  following  documents: 

Concept  of  operations,  functional  model, 
design  model,  specification,  as  defined  in  the 
previous  section. 

The  conc.'i  t  of  01  er.it ions  describes  how  a  new 
system  will  be  used,  together  with  the  existing 
resources  and  weapons,  to  get  a  net  increase  in 
the  overall  effectiveness  of  the  entire  set  of 
resources.  Considered  by  itself,  only  system 
performance  can  be  measured.  Considered  in  its  con¬ 
text,  system  effectiveness  can  be  measured.  The 
concept  of  operations  provides  a  purpose,  view, 
and  context  for  a  functional  model.  It  binds 
the  document  chain  to  the  operational  context. 

The  functioned  node!  describes  how  the  sys¬ 
tem  must  behave  and  what  it  must  be  able  to  do 
for  its  user.  The  user’s  requirements  for  speed, 
accuracy,  size,  etc.  are  documented  in  the 
functional  model.  Later  this  document  will  be 
used  by  the  implementer  to  understand  the  user's 
"utility  function"  so  implementation  trade-offs 
can  be  made  to  satisfy  the  user  (rather  than 
vague  ideas  of  efficiency).  The  functional  model 
binds^  the  problem  description  to  the  operational 
context. 

The  design  model  describes  the  modular  struc¬ 
ture  (logical  structure)  of  the  system  to  be  built. 
As  defined  above,  it  is  the  "simplest"  description 
of  the  system  in  terms  of  Constantines  "structured 
design."  [ 5 ] , [ 6 ] , 1 7 ]  The  functions  in  this  model 
are  allocated  to  hardware  or  software  according 
to  the  trades  made  by  the  system  engineers.  The 
structure  constrains,  but  does  not  specify,  the 
algorithms  to  be  used.  This  model  binds  problem 
structure  to  solution  structure. 

The  specification  chooses  the  logic  or  algor¬ 
ithms  of  the  system  to  get  the  behavior  required 
by  the  functional  model.  The  specification  binds 
solution  behavior  to  problem  behavior. 

The  design  model  and  specification  do  not 
include  any  properties  of  the  hardware  or  soft¬ 
ware  used  to  implement  the  system.  They  do  con¬ 
strain  the  implementer  to  produce  a  system  of  a 
certain  structure  and  behavior,  and  guide  trade¬ 
offs  among  alternatives  with  a  utility  function 
from  the  functional  model. 
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PROBLEM  SOLUTION  IMPLEMENTATION 
OESCRI-  DESCRI-  DESCRIPTION 


PTION 

PTION 

Required  Capability 

why 

Concept  of  Operation 

what 

why 

Functional  Model 

how 

what 

why 

Design  Model 

how 

what 

Specification 

how 

LINKS  IN  THE  CHAIN 


For  the  documents  to  form  a  chain,  they  must 
be  linked.  These  links  are  made  in  the  SADT 
models  forming  the  chain.  The  SADT  syntax  pro¬ 
vides  a  way  of  formally  linking  the  models. 

The  concept  of  operations  contains  the 
functional  model  embedded  within  it.  The  func¬ 
tional  model  is,  in  a  sense,  a  named  subset  of 
the  concept  of  operations,  drawn  from  another 
view.  The  design  model  is  linked  to  the  functional 
model  using  the  SADT  mechanism  syntax.  Each  func¬ 
tion  in  the  functional  model  contains  a  mechanism 
"call"  reference  to  the  portion  of  the  design 
model  that  realizes  that  function.  The  desiqn 
model  can,  if  desired,  be  annotated  with  SADT 
support  arrows  which  relate  portions  of  the 
design  back  to  the  functions  in  the  functional 
model.  Thus,  requirements  traceability  back 
and  forth  from  functional  model  to  design  is 
maintained. 

The  specification  is  an  annotation  of  the 
design  model.  Thus  no  cracks  between  the  models 
that  accumulate  "noise"  are  permitted. 


The  argument  to  this  point  has  relied  some¬ 
what  on  the  properties  of  SADT  models.  However, 
the  concept  is  more  general  than  that.  It  should 
work  for  a  modeling  technique  with  the  following 
properties: 

1.  Describes  modular  structure,  hierarchically. 

2.  Provides  for  traceability  from  model  to 
model . 

3.  Contains  a  way  of  specifying  activations 
in  terms  of  output  value  "events"  under  some 
input  value  "conditions." 

SADT  has  been  used  because  it  is  a  natural  lan¬ 
guage  for  describing  system  structure  hierarchi¬ 
cally.  At  each  level,  the  pieces  and  their  rela¬ 
tions  (dependencies)  are  expressed.  The  mechan¬ 
ism  syntax  allows  traceability  between  models. 
Activation  rules  directly  transform  SADT  models 
into  finite-state  machine  descriptions.  And, 
the  language  is  a  simple,  graphic  one. 

As  an  illustration  of  how  the  mechanism  nota¬ 
tion  can  connect  one  model  to  the  next,  consider 
the  two  SADT  diagrams  "communicate"  and  "handle 
medium."  Each  diagram  is  the  first  level  decom¬ 
position  of  a  different  model. 

Box  3  on  "communicate"  has  a  downward  point¬ 
ing  mechanism  arrow  that  calls  the  AO  diagram  of 
model  CHM.  Thus  the  "Handle  Medium"  diagram 
realizes  a  box  in  another  model. 


ADEQUACY  OF  THE  MODELS 


A  MODEL  OF  THE  CHAIN  OF  MODELS 


To  evaluate  the  adequacy  of  a  model  in  the 
chain,  each  model  is  associated  with  a  development 
test,  as  follows: 

MODEL  TEST  MEANING 

concept  of  exercise  Is  it  useful? 

operations 

functional  acceptance  Does  it  work? 

design  integration  Do  the  pieces 

go  together? 

specification  verification  Are  the  pieces 

correct? 

A  model  is  adequate  if  it  can  be  used  to 
develop  a  plan  for  its  corresponding  test.  For 
users,  the  concept  of  operations  is  sufficient  if 
it  can  be  used  to  outline  an  exercise  that  uses 
the  new  system.  More  requirements  are  forced  out 
early  in  the  process  by  consideration  of  details 
of  the  test. 

This  criterion  also  suggests  what  should  be 
excluded  from  models.  Information  beyond  that 
necessary  to  construct  the  test  should  be  viewed 
with  suspicion;  it  may  be  merely  extraneous,  it 
may  belong  in  a  different  model,  and  its  inclusion 
at  this  level  may  overconstrain  the  next  level's 
model.  Not  everything  you  know  about  a  system 
needs  to  be  in  the  next  document  you  write  about 
the  system. 


The  SADT  diagrams  of  this  section  present  a 
model  of  a  Naval  system  development  from  the  view¬ 
point  of  a  program  manager.  The  diagrams  are 
annotated  to  show  where  the  documents  of  the  chain 
can  be  used  to  help  satisfy  the  requirements  of 
DSARC  reviews  that  lead  to  formal  milestone 
decisions. 

Each  document  of  the  chain  appears  at  a  dif¬ 
ferent  level  of  detail  in  the  model.  The  concept 
of  operations  appears  at  the  first  level  on  the 
AO  diagram.  The  functional  model  appears  at  the 
second  level  on  the  A3  diagram,  and  so  on.  All 
of  these  diagrams  (AO,  A3,  A33)  that  describe  a 
document  in  the  chain  have  the  same  form;  each  is 
related  to  its  corresponding  test.  Diagram  A1 
corresponds  to  the  concept  formulation  stage  in  a 
major  system  acquisition. 

There  are  three  reasons  for  including  this 
model:  it  gives  a  sample  of  an  SADT  model  for 
those  unfamiliar  with  the  method,  it  provides  a 
scheme  for  delegation  of  design  authority,  and  it 
provides  a  framework  to  discuss  the  order  in  which 
the  documents  are  produced  in  a  real  development. 

The  method  of  delegation  of  design  authority 
embodied  in  the  model  was  inspired  by  Cowen  (81. 

At  the  operational  level,  "Build,"  Box  3,  is  uelega 
ted,  while  authority  for  the  other  functions  is 
retained.  Box  3  on  A3  and  A33  is  similarly  delega 
ted.  Each  level  is  responsible  for  a  model  in  the 
chain.  Each  delegation  is  accompanied  by  a  con¬ 
straint  to  similiarly  delegate  and  constrain. 
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Responsibilities  work  out  like  this: 


concept  of  operations  ^ _ 

functional  model  - 

design  model  << — — - - 

specification  ^ - N 

implementation  < - ,  ' 

w?er  pr09rara 
unit  test  ^ _ i '  designer 

verification  <; - 1 

integration  test^ - 

acceptance  test  < - 

exercise  < - - - — 


j  analyst 
system 

engineer  operators, 
users 

l  I 


The  model  shows  the  dependencies  of  a  set  of 
activities  and  the  documents  they  produce.  It 
constrains,  but  does  not  specify,  the  exact  order 
in  which  the  development  activities  are  carried 
out.  For  example,  there  are  feedback  loops  on 
diagram  AO  that  may  cause  activities  to  be 


repeated,  depending  on  how  feasibilities  work 
out.  Opportunities  for  subtle  activation  paths 
abound.  On  Diagram  A1 ,  Box  3  has  a  mechanism 
support  arrow  that  indicates  Diagram  A3  (Build) 
may  be  invoked  in  the  course  of  evaluating  costs 
and  effectiveness.  This  could  arise  if  it  becomes 
necessary  to  build  a  feasibility  demonstration 
of  some  risky  part  of  the  system  to  fully  evaluate 
an  alternative. 

The  model  allows  many  alternate  activation 
paths  and  dynamic  behavior  patterns  in  system 
development;  however  the  documents  produced  at  the 
end  are  constrained  to  relate  to  each  other  in  a 
particular  way. 

in  practice,  an  activation  of  this  model  may 
begin  with  a  functional  model  rather  than  a  concept 
of  operations.  Technologists  may  propose  a  functiona 
model  of  a  new  kind  of  system  to  find  a  use  for  a 
new  technology.  A  concept  of  operations  would  be 
created  for  it,  along  with  concepts  of  operations  for 
competing  alternatives.  A  revision  of  the  external 
details  of  the  functional  model  would  likely  result, 
and  a  new  functional  model  would  then  be  created  in 
depth. 


1 

f  /VM/V rv//>6  1  K!  ij 

K»<\trAf*i4//Yr  5  t  f"1  /V 

1  ~ 

i  1 

>'Ktp  CAPAfttl/ r/t  -  <} 

C*f  t'r*t>nr  (  t)  ('W/S'nuv  titnjt/rr 
'  /  A  /f  Mf  t*  *  <Af//v*)) 

CuKvervr  rtAvy  £ 
CvoivutG  rtcHHoiOt-’ V 

of  vt  t  or 

nf  u< 

.svi/f'*) 

*in: 

—  ?  -  . —  - ► 

^  fs  AKt.F  rv£u>  •y'rsrfn 

f’rscr,f,r  ,/.+{ 't'f'irnttifion  «•»//*//*’»  £  ,nfnr mrtl'Qft'  /viv/n  /i 
necessary  to  manage  h'f't'f nt  of  nrtr  ^ ' 

1  viCn>r>mr*r:  f'rogram  frlc/nagcr 

Adding  nr  tv  f  <>  /y /ivy  >  to/f  nt  At  y  ueh,r»c  ,j 

retj,  j  h,i,  ,y 

NOOr  TITl  c 

f>C ,  V  0  pnciWe/vrA  T /'HA,* 

TKlllMllf  » 

1  . 

58 


59 


Cl  CZ 


^ARDUiAGE/ 
jOFTUiAKC 
PAHTl  T/ON‘f<iCr 


-1 

[Tint 

DC/A33I 

bestGH  t 

■  | 

\  mp  i  ' 

CONCLUSIONS 

By  establishing  a  rigorously  linked  chain  of 
documents  connecting  requirements  in  their  opera¬ 
tional  context  to  a  specification  of  what  the 
imp) ementer  must  build,  the  accumulation  of  noise 
in  the  process  is  prevented. 

By  introducing  a  test  for  the  adequacy  of 
each  model,  more  requirements  are  forced  out 
early  in  the  process,  and  attention  is  focused 
earlier  on  feasibility  issues. 

Inis  method  has  not  been  tried.  However, 

SADT  has  been  used  to  construct  a  Concept  of 
Operations.  A  functional  model  has  been  built 
and  transformed  into  a  design  model  [9],  As  yet, 
no  design  model  has  been  completely  turned  into 
a  specification  using  activation  rules;  for 
various  reasons  other  specification  languages 
(e.g.  programing  languages),  have  been  used. 

The  author  intends  to  try  the  technique  on  a 
software  development  in  the  near  future. 
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Abstract 


One  of  the  most  difficult  aspects  of 
software  development  is  transforming  the  systems 
analyst's  specifications  into  process  specifi¬ 
cations  that  can  be  understood  and  carried  out 
by  a  computer.  We  describe  a  method  in  this 
paper  that  can  be  used  to  reduce  the  complexity 
of  the  programmer's  task  and  to  ensure  a  higher 
degree  of  correlation  between  the  systems 
analyst's  specifications  and  the  program 
produced  for  the  application. 


Currently,  the  programmer  has  the  task  of 
interpreting  and  translating  the  user's  require¬ 
ments  into  specifications  that  can  be  understood 
by  the  computer.  As  a  result,  a  common 
phenomenon  associated  with  each  implementation 
of  a  system  is  that  the  user's  requirements  have 
not  been  satisfied.  To  begin  to  solve  this 
problem,  we  propose  that  the  complexity  of  the 
programming  should  be  reduced.  This  will  reduce 
the  number  of  errors  committed  by  programmers 
and  will  allow  what  the  systems  analyst 
specifies  to  be  more  accurate. 

Ideally,  we  are  striving  towards  the 
elimination  of  programming  in  the  form  that  it 
currently  exists.  In  this  paper,  we  describe 
one  method  that  we  are  trying  to  apply  as  a 
means  to  this  end.  The  basic  ideas  of  our 
methodology  have  been  taken  from  [1],  More 
research  is  needed  to  assess  its  potential. 

We  will  describe  our  methodology  In  terms 
of  an  example.  Then,  we  will  summarize  the 
characteristics  of  our  approach. 

Our  example  is  a  bank  transaction  applica¬ 
tion.  Specifically,  the  processing  of  a 
universal  form  for  deposit,  withdrawal  and 
transfer  of  funds  between  accounts.  The  Input 
fields  are: 

1.  date 

2.  bank  office 

3.  teller 

4.  type  of  transaction 

5.  account  number (s) 


The  systems  analyst  usually  specifies  the 
operations  to  be  performed  on  each  data  field. 

An  example  of  the  specifications  are: 

1.  Date 

a.  check  whether  date  is  current 

b.  calculate  interest  if  not  already 
done  for  previous  period (s) 

2 .  Bank  Office 

a.  check  whether  valid 

3.  Teller 

a.  check  whether  teller  number  is 
valid  for  that  office 

4.  Type  of  Transaction 

a.  deposit  into  accountl 

b.  withdrawal  from  accountl  -  check 
whether  balance  >  0 

c.  transfer  from  accountl  to  account2 
if  balance  in  accountl  >  0 

3.  Accountl 

a.  check  whether  still  active 

6.  Account2 

a.  check  whether  still  active. 

An  important  characteristic  of  the  systems 
analyst's  specifications  is  that  they  are  usually 
non-procedural  and  are  centered  around  the  data. 
The  programmer,  however,  has  the  duty  of  taking 
these  non— procedural  statements  and  transforming 
them  into  a  program  -  i.e.,  something  that  is. 
proce<  iral  and  process-oriented.  There  is  much 
decision  making  that  the  programmer  must  do  to 
create  a  process  for  the  computer.  Specifically, 
he  must  decide: 

1.  In  what  order  the  fields  should 

be  processed. 

2.  What  are  the  controls  of  instruction 

execution . 
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3.  What  must  be  synchronized  and  what  are 
the  conditions  for  synchronization  if 
parallel  processing  is  used. 

Our  methodology  allows  the  programmer  to 
create  independent  portions  of  a  program  and  uses 
a  translator  to  produce  a  program  which  inter¬ 
relates  these  independent  program  protions.  For 
each  field  of  the  input  data,  using  the  systems 
analyst’s  specifications,  we  have  a  graphic 
representation: 


FIGURE  1 


Independent  Program  Portions 


FIGURE  2 


Relationship  Between  the  Program  Portions 

The  operations  in  fields  1,  2,  b ,  and  6  can 
begin  to  be  executed  in  parallel  since  the  opera¬ 
tions  on  these  fields  begin  with  intra-field 
relationships.  Field  3  begins  with  an  integrity 
check  that  involves  an  inter-field  relationship. 
Hence,  2a  must  be  completed  before  3a  begins. 
Operations  4a,  4b,  and  4c  must  wait  for  all  other 
processing  and  integrity  checks  to  have  been 
completed . 


A  node  with  a  number  represents  the  fact 
that  the  succeeding  instructions  should  be. execu¬ 
ted  to  process  that  field.  A  node  with  a  letter 
means  that  the  instruction (s)  associated  with 
that  node  should  be  executed.  For  instance,  the 
left-most  graph  represents  the  fact  that  two 
operations  ("a”  -  check  whether  date  is  current 
and  "b”  -  calculate  interest  if  not  already  done 
for  previous  period (s))  should  be  executed  to 
process  the  "date"  field.  First,  the  instruc¬ 
tion^)  for  "a"  should  be  executed  and  then  the 
instruction(s)  for  "b".  Since  only  one  of  "a", 
"b",  or  "c”  is  executed  for  field  4,  these  three 
nodes  are  not  represented  in  sequential  fashion. 

What  we  have  represented  in  Figure  1  are 
data  flow  graphs  for  each  input  data  field.  Each 
node  of  the  graph  represents  an  instruction  or  a 
sequence  of  instructions.  All  instructions 
that  operate  on  the  same  field  are  chained 
together. 

At  this  point,  the  specification  step  has 
been  completed.  We  still  have  not  specified, 
though,  what  fields  can  be  processed  in  parallel 
and  what  synchronization  events  are  needed.  In 
existing  design  and  implementation  methodologies 
these  are  defined  in  the  design  stage  or  as  late 
as  the  time  of  coding. 

The  graph  below  represents,  in  terms  of  data 
flow,  the  operations  necessary  to  check  data 
integrity,  to  process  the  data  plus  which 
Instructions  can  be  processed  in  parallel  and 
which  operations  must  be  synchronized. 


A  translator  can  produce  a  program  equivalent 
to  the  graph  in  Figure  2  given  the  instructions 
represented  in  Figure  1.  Instructions  for  each 
data  field  can  be  executed  in  parallel  until 
another  data  field  is  referenced.  When  an  instruc¬ 
tion  references  another  data  field,  all  previous 
instructions  that  reference  that  data  field  must 
have  finished  execution. 

In  summary,  we  need  to  bridge  the  gap  between 
systems  analysis  and  programming.  We  must  look 
for  methods  that  foster  (semi)  automatic  trans¬ 
lation  between  the  systems  analyst’s  specifica¬ 
tions  and  the  program  code. 

Our  methodology  assumes  that  the  modules  of 
the  system  have  been  identified  in  the  design  stage 
and  that  the  data  items  to  be  operated  on  in  each 
module  are  known.  For  each  data  item,  the  t.  ms 
analyst  then  specifies  what  are  the  op°«  io 
be  performed.  The  programmer  then  wrn  .i  .• 
dent  program  portions  for  each  data  i,  A  u  s- 
lator  is  then  used  to  produce  the  program  which 
interrelates  the  program  portions.  The  end  result 
is: 

1.  closer  interaction  between  the  systems 
analyst  and  the  programmer  since  the 
systems  analyst  writes  his  specifica¬ 
tions  after  the  design  phase 

2.  reduction  of  decision-making  responsi¬ 
bilities  for  programmers;  specifically 
he  does  not  decide  on  the  sequencing  of 
instructions  on  different  data  Items  nor 
on  any  synchronization  conditions. 
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3.  the  design  of  each  module  Is  based 
upon  its  data  items. 


Our  methodology  can  alleviate  some  of  the 
inconsistencies  between  the  user's  requirements 
and  the  programmer’s  end  product.  Programming 
is  still  necessary  but  it  has  been  reduced  in  its 
complexity.  We  anticipate  future  research  by  us 
and  others  to  further  reduce  the  complexity  of 
programming  and  to  change  it  from  a  process- 
oriented  activity.  We  hope  that  in  the  near 
future,  the  process-oriented  method  of  programming 
will  become  obsolete. 
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The  system  development  life  cycle  is  the  central 
concept  in  currently-used  methods  of  managing  and 
controlling  the  determination  of  information  re¬ 
quirements  and  designing  and  implementing  process¬ 
ing  systems  to  meet  those  requirements.  When  or¬ 
ganizations  specify  the  use  of  formal  life-cycle- 
based  methods  for  all  application  developments,  the 
results  are  mixed.  The  method  may  be  helpful,  but 
in  many  cases,  it  may  be  detrimental. 

A  single  life-cycle  method  is  not  appropriate 
for  all  cases  because  applications  differ  in  the 
certainty  with  which  requirements  can  be  estab¬ 
lished.  This  paper  describes  the  contingencies 
which  define  the  uncertainties  in  the  determina¬ 
tion  of  information  requirements  and  describes  al¬ 
ternative  strategies  for  information  requirements 
determination  given  different  levels  of  uncertain¬ 
ty  about  information  requirements.  The  methods 
are  no  cycle,  linear  life  cycle,  recursive  life 
cycle,  and  prototype. 


Introduction 

Formal  life  cycle  development  methodologies  are 
not  consistently  reflected  in  practice;  concepts 
such  as  throw-away  code,  prototype  systems,  and 
recursive  life  cycles  are  receiving  increasing 
attention  in  industry.  The  profession  has  de¬ 
veloped  a  paradox:  the  life  cycle  procedures  when 
carefully  followed  provide  a  high  degree  of  assur¬ 
ance  of  project  success  for  many  systems,  but  they 
may  be  prohibitively  expensive  andunwieldly  for 
many  others.  This  suggests  that  there  may  be  dif¬ 
ferent  methods  for  different  projects. 

A  contingency  theory  is  a  theory  which  identi¬ 
fies  alternative  actions  and  presents  factors  to 
use  in  selecting  the  optimal  alternative.  For 

example,  McFarlan^  proposes  a  cent ingency  theory 
for  development  project  management.  He  identifies 
project  size,  degree  of  structuredness,  and  degree 
of  company-relative  technology  as  factors  which 
determine  the  best  planning  and  control  tool  for 
a  project. 
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Uncertainty  has  been  identified  by  Galbraith 
as  a  major  factor  in  determining  the  optimal  or¬ 
ganization  structure.  The  difference  between  the 
amount  of  information  necessary  to  perform  a  task 
and  the  amount  of  information  possessed  is  a  measure 


of  task  uncertainty;  organizations  respond  by 
choosing  from  a  set  of  four  organizational  strate¬ 
gies  to  deal  with  the  level  of  uncertainty. 

The  information  requirements  determination  por¬ 
tion  of  information  systems  development  can  be 
viewed  as  a  problem  in  uncertainty;  the  formal  life 
cycle  methodology  is  a  response  to  this  uncertain¬ 
ty.  Its  intent  is  to  make  sure  that  the  organiza¬ 
tion  does  enough  information  processing  (with  re¬ 
gard  to  requirements)  that  uncertainty  is  reduced 
to  acceptable  levels.  However,  since  the  informa¬ 
tion  requirements  uncertainty  differs  widely  for 
different  applications,  a  single  formal  life  cycle 
approach  is  incapable  of  addressing  the  full  range 
of  information  systems  development  projects. 

Contingencies 

In  the  determination  of  information  require¬ 
ments  for  an  information  system  application  uncer¬ 
tainty  refers  to  knowledge  of  the  "real"  informa¬ 
tion  needs.  Development  contingencies  which  de¬ 
termine  information  requirements  uncertainty  are 
project  size,  degree  of  structuredness,  user-task 
comprehension,  and  developer-task  proficiency.  A 
systems  development  project  has  some  combination 
of  these  attributes  (and  perhaps  other  as  yet 
unspecified  contingencies).  The  combination  of 
contingencies  determines  the  choice  of  development 
methodology. 

Project  Size 

The  project  size  contingency  has  three  key 
characteristics:  duration,  number  of  people  in¬ 
volved,  and  total  dollar  amount.  These  character¬ 
istics  are  usually,  but  not  necessarily,  collinear. 
That  is,  a  high  cost  project  usually  requires  many 
people  over  an  extended  time  period.  Project  size 
is  not  a  good  measure  of  the  value  of  a  systems 
development  project,  but  it  is  correlated  with  the 
degree  of  uncertainty  of  the  results  of  the  devel¬ 
opment  process. 

Degree  of  Structuredness 

One  dimension  of  the  Gorry  and  Scott  Morton 
(1971)  framework  for  information  systems  is  that 
of  the  relative  st rueturedness  of  the  decisions  to 

he  supported  by  an  information  system.  For  in¬ 
formation  systems  information  requirements  deter¬ 
mination,  a  high  degree  of  structuredness  means 
that  a  general  model  exists  which  needs  onlv  to  be 
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applied  to  the  given  organizational  setting.  A  low 
degree  of  structuredness  means  that  there  is  no  rou¬ 
tine  procedure  for  dealing  with  the  problem,  there 
is  ambiguity  in  the  problem  definition  and  uncer¬ 
tainty  as  to  the  criterion  for  evaluating  solutions. 
Uncertainty  about  the  decision  to  be  supported  is 
an  important  factor  in  uncertainty  about  the  out¬ 
come  of  the  systems  development  process. 

User  Task  Comprehension 

Related  to  but  distinct  from  structuredness  is 
the  comprehension  that  the  user  or  users  have  of 
the  task  to  be  performed  by  the  information  system. 
User  task  comprehension  affects  the  strategy  and 
development  project  success  in  much  the  same  way  as 
degree  of  structuredness.  If  the  users  have  a  low 
degree  of  understanding  of  the  task  for  which  the 
system  is  intended,  whether  or  not  a  general  model 
of  a  problem  exists,  less  is  certain  about  the  in¬ 
formation  requirements  (and  the  users'  acceptance 
of  the  results  of  the  development  process) . 

Developer  Task  Proficiency 

Developer  task  proficiency  is  a  measure  of  the 
specific  training  and  experience  brought  to  the 
project  by  the  development  staff:  project  manager, 
liaison  staff,  systems  analysts,  systems  designers, 
programmers,  etc.  It  is  not  a  measure  of  ability 
or  potential:  rather  it  is  a  measure  of  directly 
applicable  experience.  This  contingency  indicates 
the  degree  ol  certainty  with  which  the  developer 
will  be  able  to  obtain  and  document  the  require¬ 
ments  (and  also  proceed  with  the  remainder  of  the 
development  process) . 

Uncertainty-Reducing  Strategies 

The  response  to  uncertainty  produced  by  charac¬ 
teristics  of  a  systems  development  task,  the  using 
organization,  and  the  developer  organization  (i.e., 
the  contingencies)  has  frequently  been  unidimen- 
tional.  Under  the  traditional  life  cycle  approach, 
formal  procedures,  reviews,  committees,  check 
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points,  etc.  are  used  for  all  projects.  There  has 
been  10  recognition  of  the  degree  of  uncertainty 
from  the  contingencies.  An  alternative  approach 
Is  to : 

1.  Identify  the  contingencies  and  determine  the 
uncertainty, 

2.  Select  an  information  requirements  determi¬ 
nation  method  suitable  for  the  level  of  uncertainty. 
Figure  1  shows  the  relationship  of  uncertainty- 
reducing  strategies  to  level  of  uncertainty,  and 
suggests  methodologies  which  are  actualizations  of 
these  strategies.  The  strategies  are :  accept  infor¬ 
mation  requirements  as  specified,  linear  discovery, 
recursive  discovery,  or  experimental  discovery  of 
information  requirements.  Each  has  an  associated 
method . 

Accept  as  Sp  ’cified 

If  information  requirements  are  known  and 
agreed  upon,  then  the  proper  strategy  is  to  accept 
the  user's  statement  of  need  as  adequate  specifica¬ 
tion  for  implementation.  The  method  is  therefore 


to  have  no  information  requirements  cycle.  Exam¬ 
ples  are  file  conversions,  reports  from  existing 
files  or  databases  and  small  single-user  models. 
These  examples  have  in  common:  small  size,  high 
degree  of  structure,  users  who  understand  what 
the  systems  are  to  do  and  how  the  implementation 
will  function,  and  developers  who  are  experienced 
in  this  kind  of  task.  Explicit  recognition  of  the 
need  for  the  "accept  as  specified"  strategy  will 
lead  to  greater  responsiveness  and  an  increase  in 
development  organization  efficiency. 

Linear  Discovery 

If  information  requirements  can  be  determined 
through  a  straight-forward  process  of  interviewing, 
fact  gathering,  and  documentation,  the  proper  stra¬ 
tegy  is  to  proceed  step-by-step  to  system  specifi¬ 
cation.  The  method  is  therefore  a  linear  applica¬ 
tion  of  the  life  cycle.  Examples  are  transaction 
level  systems,  single  function  accounting  systems 
such  as  accounts  receivable  or  payable,  and  minor 
modifications  to  existing  information  systems. 

The  information  requirements  for  large  systems 
which  are  highly  structured  and  where  user-task 
comprehension  and  developer-task  proficiency  are 
high  may  be  effectively  determined  by  the  linear 
discovery  process.  However,  information  require¬ 
ments  for  a  relatively  small  system  may  not  be 
determinable  by  this  method  if  the  decisions  to  be 
supported  are  relatively  unstructured,  or  if  the 
user  does  not  comprehend  the  task,  or  if  the  de¬ 
velopers  have  not  previously  produced  such  a  sys¬ 
tem.  Linear  application  of  the  life-cycle  model  is 
an  effective  strategy  under  the  appropriate  combi¬ 
nation  of  contingencies. 

Recursive  Discovery 

The  linear  discovery  strategy  may  not  produce 
correct  or  complete  or  acceptable  specifications 
of  information  system  requirements.  The  tradition¬ 
al  life-cycle  approach  extends  to  recursion  for 
such  systems.  One  or  more  discovery  tasks  are  it¬ 
erated  until  a  complete,  consistent  specification 
is  determined  and  accepted.  Examples  are  large, 
multiple-user  systems,  systems  which  are  new  to 
the  user  or  developer  organization,  and  systems 
which  support  the  relatively  unstructured  decisions 
of  tactical  and  strategic  management.  This  ap¬ 
proach  assumes  that  a  correct  specification  of  re¬ 
quirements  can  be  made  given  sufficient  time  and 
effort.  Where  the  contingencies  indicate  that  is 
a  valid  assumption,  the  recursive  discovery  strat¬ 
egy  is  appropriate  and  effective. 

Experimental  Discovery 

A  high  level  of  uncertainty  may  be  indicated  by 
a  combination  of  the  contingencies.  Repeated  it¬ 
erations  of  discovery  may  not  successfully  produce 
adequate  specif ications  of  information  require¬ 
ments  in  such  cases.  The  life  cycle  method, 
whether  linear  or  recursive,  is  inappropriate  when 
uncertainty  is  high.  The  strategy  of  experimental 
discovery  as  realized  in  the  prototype  design 
method,  reduces  uncertainty  by  producing  successive 
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approximations.  Users  and  developers  can  easily 
see  what  is  wrong  with  an  implementation  ever, 
though  they  are  unable  to  completely  specify  its 

information  requirements.^  Examples  of  appropri¬ 
ate  applications  of  the  experimental  discovery  ap¬ 
proach  are  decision  support  systems  for  upper  man¬ 
agement,  interactive  forecasting  models,  and  small 
(or  large)  systems  to  be  implemented  for  many  dif¬ 
ferent  users.  Conscious  selection  of  the  experi¬ 
mental  discovery  strategy  may  be  the  only  effective 
approach  to  information  requirements  determination 
when  the  level  of  uncertainty  is  high. 

Conclusion 

A  range  of  information  requirements  determina¬ 
tion  strategies  is  needed.  Such  strategies  match 
the  level  of  uncertainty  about  the  system  specifi¬ 
cation  which  is  to  result.  The  appropriate  strat¬ 
egy  is  determined  by  the  extant  contingencies. 

Research  is  needed  in  two  major  areas  of  the 
contingency  theory:  more  precise  operational  def¬ 
initions  of  the  contingencies,  and  identification 
and  definition  of  other  factors  which  effec*  the 
level  of  uncertainty,  should  he  demonstrated.  The 
range  of  discovery  strategies  and  corresponding 
information  requirements  determination  methods  may 
be  extended.  The  results  of  such  research  will 
lead  to  improved  selection  of  methodology,  and 
more  effective  systems  development. 
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Abstract 

A  new  model  for  the  software 
development  process  is  presented  in  which 
the  rate  of  progress  is  determined  by  the 
extent  to  which  the  system  under 
development  can  be  decomposed  into  modules 
which  permit  independent  development.  The 
pattern  of  resource  consumption  of  a 
project  over  its  life-cycle  can  be  derived 
from  this  model.  The  results  are  compared 
with  previous  predictions  of  a  Rayleigh 
curve  pattern. 


Introduction 

PUTNAM  and  NORDEN  (1),(2)  have 
proposed  a  life  cycle  model  for  software 
development  effort  which  predicts  that  the 
rate  of  resource  consumption  on  a  project 
should  have  a  Weibull  distribution  in 
time.  In  other  words  if  W(t)  is  the 
proportion  of  the  total  project  completed 
by  time  t  then  the  rate  of  progress  is 
give  by 

dW  -p(t) 

-  =  p*  (t)  .  e 

dt 


dW  -  }  p  (u)du 

-  -  .  dt  =  1  -  e 

!•  (D. 
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•  her  actually 
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dW 

-  =  P*  (t)  .  (  1  -  W(t)  ) 

dt  (2) 

from  which  equation  (1)  can  be  derived. 

The  function  p’(t)  defined  by  this 
expression  is  taken  to  be  a  "learning 
curve”  which  describes  how  the  development 
team  becomes  increasingly  skillful  in 
making  decisions  or  solving  problems 
associated  with  this  particular  project. 
Any  function  p' (t)  can  be  plugged  into  the 
model  and  will  generate  a  predicted 
life-cycle  pattern  belonging  to  the 
Weibull  family.  However  a  reasonable  fit 
with  empirical  data  from  actual  software 
development  projects  can  be  obtained  by 
choosing  a  linear  learning  function 
p' (t)  =  a.t  .  The  life-cycle  pattern 

resulting  from  this  choice  is  the  Rayleigh 
function 

2 

dW  -at  /2 

-  =  a.t.e 

dt 

The  general  shape  of  the  Rayleigh 
curve  matches  a  very  simple  intuitive 
picture  of  the  life-cycle  of  software. 
There  is  an  initial  design  phase  during 
which  the  rate  of  resource  consumption 
increases  steadily,  a  central  peak  of 
effort  corresponding  to  principal 
implementation  activity  and  a  maintenance 
phase  during  which  the  work  rate  decays 
back  to  zero.  These  phases  of  development 
may  be  identified  with  the  different 
assumptions  which  made  up  the  model  of  the 
development  process.  The  gradual  decay  of 
work  rate  during  maintenance  is  basically 
the  result  of  the  problem  space  becoming 
exhausted.  Since  it  was  assumed  that  the 
goal  of  the  software  project  was  some  task 
which  could  be  characterised  by  a  fixed 
finite  pool  of  problems  to  be  solved,  then 
as  the  work  progresses  this  pool  must 
eventually  empty.  This  must  cause  some 
form  of  exponential  slowing  down  of 
maintenance  effort.  However,  the  Rayleigh 
model's  explanation  of  the  initially 
growing  work  rate  in  terms  of  increasing 
skill  level  being  available  is  open  to 
several  criticisms. 
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Firstly,  a  skill  level  which  just 
grows  linearly  may  be  rather  unrealistic. 
Often  the  staff  joining  a  development 
project  will  have  worked  on  closely 
related  projects  before.  There  is  little 
reason  to  suppose  that  the  skill  of  these 
people  improves.  It  is  also  the  case  that 
programmers  remain  with  a  system  only  for 
a  portion  of  its  life-cycle.  In  that  case 
there  is  no  reason  for  those  who  join  late 
on  to  be  more  skillful  than  those  who 
started.  Taken  to  extremes,  when  a  system 
is  in  the  final  maintenance  stage  of  its 
life  and  probably  has  low  priority  it 
would  be  most  unlikely  to  attract  the  most 
expert  programmers  as  the  theory  suggests. 

Now  it  can  be  argued  that  linearity 
is  only  a  first  approximation  and  that 
these  objections  can  be  answered  by 
choosing  a  more  realistic  form  for  the 
learning  curve  p'  (t) .  However,  the  model 
as  presented  gives  little  guidance  as  to 
what  functions  for  p' (t)  would  be 
acceptable.  The  actual  property  being 
measured  by  p'(t)  is  not  sufficiently 
precisely  defined  to  allow  an  empirical 
approach.  Ideally  one  would  like  to 
measure  the  skill  level  on  some  actual 
projects  and  fit  a  curve  to  it.  In  some 
sense  the  Rayleigh  function  model  merely 
transforms  the  problem  rather  than  solving 
it;  the  difficulty  of  predicting  the 
pattern  of  resource  consumption  of  a 
project  is  exchanged  for  the  difficulty  of 
estimating  how  the  skill  level  changes. 

A  final  objection  is  that  the 
Rayleigh  model  fails  to  distinguish 
between  innate  constraints  on  the  process 
of  writing  software  and  management's 
economically  constrained  hiring  and 
methodology  decisions.  The  trouble  is  that 
the  level  of  skill  available  on  a  project 
depends  jointly  on  how  long  skilled 
personnel  have  been  assigned  to  it  and 
also  on  the  tools  and  design  methods 
which  they  use.  A  model  based  only  on 
skill  level  therefore  presents  a  picture 
of  software  development  as  being  an 
unalterable  process.  A  more  powerful 
theory  would  show  how  the  pattern  of 
development  may  be  altered  as  a  result  of 
the  introduction  of  new  software 
engineering  methods. 

This  paper  presents  an  alternative 
life-cycle  model  for  software  development 
which  is  based  much  more  closely  on  the 
nature  of  the  development  process.  The 
notion  of  skill  level  is  replaced  with  the 
idea  that  strong  modularity  in  a  program 
is  what  governs  the  rate  of  progress  since 
it  allows  work  by  several  teams  to  proceed 
in  parallel.  This  concept  clarifies  the 
role  of  project  management's  apparently 
free  choice  to  hire  staff  while  at  the 
same  time  suggesting  how  more  precise 


measurements  and  predictions  of  life-cycle 
behaviour  could  be  made. 

The  Programming  Process 

As  in  the  Rayleigh  curve  theory,  the 
new  model  will  explain  the  decay  in 
workrate  on  an  old  system  in  terms  of  the 
problem  space  becoming  exhausted.  For  this 
to  happen  the  software  must  be  intended  to 
provide  some  identifiable  service.  The 
following  principle  formalises  that 
concept . 

A 1 .  A  software  development  project  has 
associated  with  it  some  large  but  finite 
set  of  subtasks  to  be  completed  -  each  of 
which  is  either  a  decision  to  be  made  or  a 
problem  to  be  solved. 

Now  this  assumption  is  not  intended 
to  imply  that  the  size  of  the  pool  can  be 
accurately  predicted  in  practice  before 
development  work  begins.  It  may  even  be 
the  case  that  the  actual  design  chosen  for 
the  system  may  affect  the  size  of  the 
subtask  pool.  We  avoid  all  these 
difficulties  by  treating  the  subtasks  as 
infinitesimally  small  and  normalising  the 
total  amount  of  work  on  each  project  to  be 
one.  This  has  the  effect  of  focusing 
attention  on  the  pattern  of  resource 
consumption  rather  than  trying  to  estimate 
its  total  amount.  For  the  latter  task  one 
would  need  to  be  able  to  judge  the  quality 
of  acutal  system  designs. 

To  explain  the  increasing  workrate  in 
the  initial  stages  of  a  project  something 
more  detailed  than  a  prediction  of 
increasing  "skill"  level  is  required.  We 
claim  that  the  essential  mechanism  has  to 
do  with  how  much  work  may  be  carried  out 
in  parallel.  At  the  very  beginning  of  the 
project  only  a  small  team  of  analysts  can 
be  used  since  they  must  identify  the  basic 
functions  which  will  be  realised  by  the 
software.  The  initial  requirements  of 
design  team  has  only  limited  use  for 
computing  or  programmer  resources.  It  is 
only  after  the  major  functions  and  their 
interfaces  have  been  identified  that  these 
can  be  grouped  into  components  and  passed 
out  for  further  design  and  implementation. 
The  key  fact  is  that  if  the  initial 
decomposition  was  done  well,  each  of  the 
components  identified  can  be  developed  by 
separate  programmers  or  teams  of 
programmers  working  in  parallel.  Hence 
more  development  resources  can  be 

effectively  applied  and  a  faster  rate  of 
progress  achieved  as  a  result.  Thus 
although  management  is  free  to  follow  any 
hiring  policy,  if  the  system  is  to  be 
developed  as  fast  as  possible  for  a  given 
cost,  the  rate  of  progress  will  be 

increasing  during  development. 
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In  the  above  paragraph  design  of  a 
component  (specif ication  of  its 

input/output  behaviour)  was  taken  to 
always  precede  implementation.  On  a 
large-scale  project  it  may  not  be  easy  to 
classify  activities  in  that  way.  It  has 
already  been  convenient  to  formalize  the 
entire  development  task  as  a  homogeneous 
space  of  subtasks.  At  this  level  of 
abstraction  the  natural  way  to  indicate 
which  problems  may  be  tackled  in  parallel 
is  to  introduce  a  dependency  relation  R  on 
the  pool  of  subtasks. 

A2 .  There  is  some  partial  order  relation  R 
defined  on  the  subtasks  of  the  project 
such  that 

x  R<  y 

means  that  subtask  x  must  be  completed 
before  work  on  subtask  y  can  begin. 

Relation  R  has  to  be  a  partial  order 
because  if  there  were  some  circular  chain 
of  subtasks 

xl  R<  x2  R<  x3  . . .  R<  xl 
then  the  project  could  never  be 
completed. 

Progress  on  the  project  is  achieved 
by  successively  solving  the  subtasks 
defined  in  A1.  As  in  the  opening  section 
let  W (t)  denote  the  proportion  of  the 
project  completed  at  time  t  -  this  being 
the  ratio  of  the  number  of  problems 
solved  at  that  time  to  the  total  number  of 
problems  on  the  project.  Then  it  is 
immediate  that  W  should  have  value  zero 
when  the  project  starts,  one  when  it  ends 
and  increase  monotonically  in  between.  The 
rate  of  progress  on  the  project  is  dw/dt. 
Since  the  mechanism  for  achieving  rapid 
progress  is  to  work  in  parallel,  it  is 
reasonable  to  assume  that  more  development 
resources  are  needed. 

A3.  The  rate  at  which  a  software  project 
consumes  programming  and  computing 
resources  is  proportional  to  the  rate  of 
progress  being  made  namely  dW/dt. 

Now  at  time  t  the  amount  of  the 
project  still  unsolved  and  therefore 
needing  work  is  1  -  W(t).  But  in  general 

many  of  these  tasks  cannot  be  started  at 
time  t  because  the  R-relation  requires 
that  other  work  be  done  first.  Let  us  say 
that  a  problem  is  visible  at  time  t  if 
all  its  R-predecessors  have  been  solved  at 
that  time.  The  following  axiom  formalises 
parallelism  as  the  mechanism  of  rapid 
progress . 

A4 .  The  rate  at  which  a  software 
development  can  usefully  consume  resources 
in  proportional  to  the  number  of  subtasks 
which  are  visible  and  unsolved  at  that 
time . 

Let  V(t)  denote  the  proportion  of  the 


total  project  which  is  both  visible  and 
unsolved  at  tine  t.  Then  combining  axioms 
A3  and  A4  gives 

dW 

-  =  a.  V  (t) 

dt  (3) 

where  a  is  some  arbitrary  constant. 

The  most  interesting  part  of  this 
model  is  concerned  with  characterising  the 
rate  of  change  of  V(t).  Now  the  only 
mechanism  for  altering  V  is  to  complete  a 
subtask.  This  will  certainly  have  the 
effect  of  removing  one  problem  from  the 
visible-unsolved  set;  it  may  also  make 
some  undetermined  number  of  other  unsolved 
subtasks  visible.  Clearly  V  depends  on  the 
rate  at  which  problems  are  being  solved: 


dV 

dW 

/average 

change! 

- - 

- X 

in  V  for 

each  1 

dt 

dt 

^  problem 

solved/ 

The  actual  change  in  the  number  of 
visible  unsolved  subtasks  when 
some  problem  is  solved  depends  on  the 
particular  problem,  the  dependency 
relation  R  and  which  other  problems  have 
been  solved  at  that  time.  If  the 
particular  subtask  completed  has  no 
R-descendents  then  V  is  merely  decreasedby 
one  or,  more  correctly  since  we  are 
working  in  continuous  time,  dV/dt=-dW/dt . 
But  if  there  are  unsolved  R-descendants 
whose  other  R-ancestors  have  all  been 
solved  then  these  become  visible  and  V  may 
be  increased.  Clearly  it  is  impracticalto 
handle  each  case  explicitly  so  some 
averaging  is  needed. 

But  the  average  change  for  each 
problem  solved  £V(t)  cannot  be  constant 
over  the  life-cycle  of  the  project.  To 
explain  this  we  introduce  the  notion  of 
terminal  -  a  subtask  of  the  project  with 
no  R-descendants.  Then  terminals  must  be 
more  dense  towards  the  end  of  the 
life-cycle  than  at  the  beginning.  For  a 
terminal  corresponds  to  a  problem  which 
when  solved  opens  up  no  further  work.  If 
all  the  early  subtasks  in  a  project  were 
terminal  then  the  project  would  quickly 
come  to  an  end  with  no  further  work  to  be 
done.  Conversely  a  project  with  few 
terminal  subtasks  in  its  rater  phases 
would  never  reach  a  proper  maintenance 
mode.  So  one  must  expect  that  the  density 
of  terminals  will  increase  as  the  project 
progresses.  Consequently  £V(t)  should 
initially  be  positive  (most  problems  have 
several  R-descendants  which  become 
visible)  ar.d  gradually  turn  negative  as 
the  project  develops.  In  lact  at  the  very 
end  of  the  project  one  can  be  certain  that 
the  work  being  done  is  terminal  so  as  W(t) 
■>  +1  then  V(t)4-1. 
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A  simple  but  powerful  law  with  these 
properties  is 

a  (t)  =  c  -  fi.W  (t)  k 

where  b,c,d  are  arbitrary  constants.  In 
order  to  meet  the  limiting  requirement 
above  d=c-1.  This  leads  to 


dV 


dt 


dW 

dt 


(c+1) .W(t) 


(4) 


which  together  with  equation  (3)  defines 
the  life-cycle  pattern  predicted  by  the 
new  model  for  the  development  process. 

Analysis  of  the  Model 

Combining  equations  (3)  and  (4)  gives 


2 

d  W 


dt 


dW 


dt 


(1+c) . W (t) 


which  may 

be  integrated  as  it 

give 

dW 

f  1+c  b+ 1 j 

- -  a  1 

c.W  -  - ,W(t)  > 

dt 

l  1+b  1 

This  can  be  seen  to  have  the  solution 
1 

WW(t)  = 


[c+1 
(b+1 


-ac (b+1 ) 


+  e 


)c 


]  V(1 


+b) 

(5) 


The  function  defined  by  equation  (5)  has 
the  desired  property  that  as  t  becomes 
large  and  negative  W(t)  ^  0.  We  also 

require  that  the  total  amount  of  work  on 
the  project  is  1.  This  will  occur  if 
1+c=(1+b)c  which  is  solved  by  c=1/b.  Hence 
one  obtains  the  simpler  formula  for 
progress  in  development 


1 


W(t)  = 


1 


-a(b+1)t/b j  1/(b+i; 


and  the  corresponding  pattern  of  resource 
consumption  or  workrate  is  found  by 
dif  f erentiating : 


dW  a  -pt 

- =  -  .  e 

dt  b 


•[’-1 


\  1+b  / 


where  p  =  a(b+1)/b 
Comments  and  Conclusions 


derived  above  is  a  smooth  bell-shaped 
curve.  If  parameter  b  is  chosen  with  a 
value  between  one  and  zero  then  the  curve 

is  skewed  to  the  right  and  becomes  quite 

similar  in  appearance  to  the  Rayleigh 
function.  It  can  be  argued  that  this  • 
parameter  is  a  measure  of  the  extent  to 
which  a  structured  programming  methodology 
was  used.  For  large  b  values  give 

dependency  relations  R  with  the  terminals 
particularly  strongly  clustered  toward  the 
end.  But  the  main  tenet  of  improved 

development  methodology  is  exactly  to 
force  the  early  decisions  to  be  structural 
ones  which  define  modules  for  subsequent 
implementation;  details  of  low  level 
implementation  should  be  considered  only 
later  in  the  project;  these  correspond  to 
terminal  decisions.  Hence  the  model  could 
be  used  to  measure  whether  a  project 
succeeded  in  using  its  intended 
methodology . 

The  fact  that  this  new  model  predicts 
life-cycle  patterns  approximating  the 
Rayleigh  function  as  a  special  case  is  an 
advantage  since  empirical  evidence  has 
been  published  (2)  supporting  the  use  of 
that  function.  The  most  obvious  difference 
between  this  and  the  Rayleigh  curve  theory 
is  that  the  life-cycle  obtained  here  is 
defined  over  all  time  rather  than  just 
positive  time.  This  is  no  great  defect 
since  for  many  systems  it  is  impossible  to 
define  an  exact  date  when  the  first 
development  work  relevant  to  that  system 
was  done. 

Finally  it  may  be  possible  when 
analysing  an  actual  project  to  obtain  a 
much  more  detailed  picture  of  the 
dependency  relation  involved  by 

constructing  some  form  of  PERT  chart.  This 
would  enable  a  much  tighter  relationship 
milestones  in  development  than  has 
hitherto  been  possible.  The  accuracy  of 
estimates  of  completion  dates  should  be 
improved  in  consequence. 
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ABSTRACT 

Past  attempts  to  establish  mathematical  ex¬ 
pressions  that  can  predict  the  life  cycle  cost 
components  for  software  systems  have  achieved 
only  qualified  success.  The  mathematical  models 
for  these  relationships  included  only  variables 
that  describe  the  software  characteristics  and 
related  environmental  factors.  This  paper  pre¬ 
sents  the  hypothesis  that  software  cost  estimat¬ 
ing  relationships  must  include  the  effects  of  re¬ 
sources  consumed  in  one  life  cycle  phase  on 
other  phases.  Such  a  model  is  difficult  to  vali¬ 
date.  This  is  primarily  due  to  the  need  for 
greater  quantities  of  data  of  greater  precision 
than  is  usually  available.  However,  a  prelimi¬ 
nary  result  detained  from  existing  data  is 
positive.  Therefore,  additional  research  is 
justified. 


INTRODUCTION 

Our  objective  is  to  obtain  reliable  esti¬ 
mates  of  software  life  cycle  costs  suitable  for 
initial  planning.  This  requires  the  establish¬ 
ment  of  empirical  relationships  between  life 
cycle  cost  and  certain  variables.  We  call  these 
relationships  Cost  Estimating  Relationships 
(CERs) . 

Our  approach  to  establishing  CERs  for  soft¬ 
ware  life  cycle  costs l >2  has  been  similar  to 
other  researchers.  We  have  postulated  the  im¬ 
portant  relationships  between  the  products  and 
the  amounts  of  resources  needed  to  develop  and 
operate  them.  We  have  collected  project  data  de¬ 
scribing  resources  expended  and  program  character¬ 
istics  and  tried  to  substantiate  the  mathematical 
expressions.  We  have  achieved  relatively  limited 
success.  The  CERs  developed  to  date  have  relat¬ 
ively  low  precision  and  are  applicable  only  to  en¬ 
vironments  where  resource  reporting  and  defini¬ 
tions  are  comparable  to  that  in  which  the  data 
were  obtained.  Our  experience  is  not  unique.  The 
literature  shows  that  literally  hundres  of  soft¬ 
ware  and  environmental  parameters  occurring  sing¬ 
ly  and  in  almost  endless  combinations  have  been 
tested  in  efforts  to  predict  life  cycle  cost. 

Our  failure  to  obtain  quantitative  relation¬ 
ships  of  a  precision  comparable  to  those  available 
for  estimating  the  costs  of  hardware  systems  has 
let  us  to  question  the  assumption  that  was 


implicit  in  our  previous  analyses.  That  assumpt¬ 
ion  was  that  life  cycle  cost  is  predictable  using 
variables  that  simply  describe  the  product  and  the 
development  environment  (i.e.,  the  type  of  con¬ 
tract,  program  management  technique,  etc.).  We 
have  formed  the  opinion  that  while  all  these  fact¬ 
ors  are  relevant,  a  large  contribution  to  the  re¬ 
source  requirements  for  any  one  phase  derives 
from  the  ways  in  which  the  other  phases  are  com¬ 
pleted. 

The  way  the  project  is  executed  in  terms  of 
its  relationship  to  the  planned  development  and 
other  outside  pressures  determines  how  the  re¬ 
sources  are  consumed.  It  also  affects  the  quality 
of  the  delivered  product. 

We  believe  these  relationships  are  intuitive 
for  persons  experienced  in  project  management.  We 
would  like  to  develop  the  hypothesis  of  phase  in¬ 
terrelationships  by  walking  through  two  project 
histories  that  illustrate  the  concept.  In  the 
succeeding  sections  of  the  paper  we  will  propose  a 
model  and  show  the  results  of  applying  it  to  actual 
data. 

THE  CASE  FOR  PHASE  INTERRELATIONSHIPS 

An  important  factor  affecting  Che  utilization 
of  resources  is  the  need  to  conform  to  a  develop¬ 
ment  plan.  The  plan  is  an  essential  management 
tool  for  ensuring  that  needed  resources  are  avail¬ 
able  to  the  project  at  the  proper  time  and  in  the 
correct  amounts.  We  would  like  to  see  how  changes 
in  the  plan,  caused  either  by  changes  in  require¬ 
ments  or  by  failure  to  meet  commitments,  affect 
cost-driving  parameters.  Particularly,  we  would 
like  to  see  how  management  actions  influence  the 
measureable  project  descriptors.*  Understanding 
this  relationship  should  permit  a  more  accurate 
analysis  of  the  cost-driving  variables. 

A  Project  With  Significant  Interrelationships 

Figure  1  shows  the  time  spans  and  levels  of 
effort  for  the  different  phases  of  a  software 
development  project.  This  project  and  the  other 
one  In  this  section  are  business  data  processing 


Project  descriptors  include  man-hours  for  analy¬ 
sis,  coding,  testing,  etc.  (planned  and  actual), 
time  span  for  the  activities,  numbers  of  per¬ 
sonnel,  application  classification,  etc. 
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systems  developed  by  a  military  agency  for  world¬ 
wide  use.  Both  systems  were  written  in  COBOL. 
Planned  values  are  shown  by  solid  lines  and  actual 
values  by  dashed  lines. 

Notice  that  there  is  considerable  scheduled 
overlapping  of  the  Design  and  Coding  activities. 
This  overlapping  is  a  common  practice,  but  it  in¬ 
creases  the  likelihood  that  changes  in  the  design 
will  require  parts  of  the  system  to  be  recoded. 
Such  a  schedule  might  have  been  adopted  because 
time  was  short  or  because  people  were  available 
only  at  certain  times.  In  either  case,  over¬ 
lapping  causes  any  problems  or  delays  to  have  in¬ 
creased  impact  on  the  work. 


of  delays  and  parallel  activities  has  a  compound¬ 
ing  detrimental  effect  on  a  project  schedule.* 

For  example,  when  coding  is  begun  before  the  com¬ 
pletion  of  design,  the  designers  are  required  to 
coamunicate  their  results  to  the  programmers  in  a 
raw,  unqualified  state  (hence  significantly  in¬ 
creasing}  the  chance  of  design  errors).  Overlapping 
also  raises  the  possibility  that  the  designer  may 
not  charge  a  poor  procedure  when  he  discovers  it, 
because  he  has  already  committed  himself  to  the 
programmer.  Many  times  the  programmer  may  fill  in 
missing  information  by  himself.  By  doing  this  he 
may  introduce  errors  into  the  system  that  will  not 
be  discovered  until  late  in  the  testing  program 
when  repairs  will  be  time-consuming  and  expensive. 


The  Analysis  activity  of  the  project  was 
carried  out  at  about  the  planned  level  of  effort.* 
The  long  delay  in  its  completion  was  accompanied 
by  some  delay  in  the  start  of  the  Design  activity. 
This  latter  delay  was  probably  beneficial  since 
failure  to  delay  the  start  of  an  overlapped 
activity  can  Increase  scheduling  problems.  How¬ 
ever,  completion  of  Analysis  lagged  until  three 
months  after  its  scheduled  date,  and  in  the  mean¬ 
while  both  the  Design  and  Coding  activities  were 
started. 


We  are  not  yet  trying  to  build  a  case  for 
cause-and-ef feet  relationships  between  Increased 
consumption  of  resources  and  delays  and  over¬ 
lapping  activities.  We  are  simply  using  an  ex¬ 
isting  project  history  to  illustrate  how  inter¬ 
actions  among  activities  may  be  seen  to  influence 
the  expected  resource  requirements.  On  the  basis 
of  a  single  project  one  could  simply  conclude  that 
the  project  was  poorly  planned  and  executed. 

A  Project  With  Few  Interrelationships 


The  delay  in  the  completion  of  Analysis  in¬ 
dicates  that  some  needed  information  was  missing, 
some  procedures  were  not  defined,  or  unexpected 
problems  occurred.  Going  ahead  with  the  Design  and 
Coding  activities  would  almost  assure  an  increase 
in  the  time  required  to  code  and  test  the  system. 
This  is  a  good  example  of  one  type  of  interaction 
between  activities.  The  increased  coding  and  test¬ 
ing  hours  would  not  be  predicted  by  any  method  that 
relied  solely  on  parameters  related  to  those 
activities. 

The  delays,  along  with  the  substantial  over¬ 
lapping,  were  associated  with  a  significant  in¬ 
crease  in  the  resources  required  to  complete  the 
project : 


Estimated 

Actual 

Percent 

Activity 

Man-Months 

Man-Months 

Increase 

Analysis 

4.5 

6.6 

47 

Design 

9.1 

9.9 

9 

Coding 

Test  & 

4.5 

19.6 

336 

Integration 

Qualification 

4.2 

10.6 

152 

Testing 

2.2 

6,0 

173 

24.5 

52.7 

115 

The  figures  indicate  that,  for  this  project, 
delays  and  overlapped  activities  were  associated 
with  large  Increases  in  the  consumption  of  re¬ 
sources  over  what  was  expected.  The  combination 


We  divided  total  man-hours  by  time  span  to  de¬ 
termine  average  staffing.  Therefore,  any  gaps  in 
the  work  would  reduce  the  calculated  staffing. 


Figure  2  shows  a  project  that  was  completed 
in  a  better  fashion.  The  Analysis  and  Design 
activities  overlapped  but,  significantly,  the  de¬ 
sign  was  completed  before  coding  started.  Less 
effort  was  put  into  the  analysis  than  had  been 
planned,  but  there  was  an  Increase  in  the  hours 
required  for  the  design:** 


* 

This  is  not  to  suggest  that  systems  cannot  be  de¬ 
veloped  with  overlapping  activities.  Many  sys¬ 
tems  have  distinct  parts  that  can  be  coded  before 
the  entire  design  is  completed.  In  a  top-down  de¬ 
sign  where  coding  is  by  tiers,  the  coding  can 
often  begin  before  the  design  is  complete.  These 
are  planned  developments  that  would  permit  the 
overlapping  of  these  functions.  We  are  concerned 
here  with  the  situation  where  the  press  of  the 
development  schedule  or  the  slippage  of  preceding 
activities  results  in  overlapping  activities  that 
would  have  been  accomplished  better  sequentially. 
Fven  in  a  planned  implementation  of  parallel 
activities,  however  (and  this  includes  top-down 
design),  whenever  the  coding  begins  before  the  de¬ 
sign  is  completed  there  is  an  Increased  risk  of 
changes  to  the  design  or  of  mismatches  in  sub¬ 
system  interfaces.  The  project  management  must 
weigh  these  risks  in  relation  to  the  need  for  work¬ 
load  balancing  and  project  scheduling. 


** 

Many  software  development  activities  are  difficult 
to  define.  The  line  between  analysis  and  design 
becomes  blurred  in  practice.  In  some  Instances 
both  functions  are  performed  by  the  same  individ¬ 
ual,  who  may  also  do  some  or  all  of  the  coding. 

It  may  be  that  in  this  instance  some  of  the  analy¬ 
sis  hours  were  reported  as  design. 


Percent 


Estimated 

Actual 

Increase 

Activity 

Man-Months 

Man-Months 

(Decrease 

Analysis 

43 

24 

(44) 

Design 

12 

16 

33 

Coding 

28 

37 

32 

Test  6 

Integration 

12 

9 

(25) 

Qualification 

Test ing 

_3 

_6 

100 

98 

92 

(6) 

The  project  was  completed  on  schedule. 

Notice  that  the  project  described  in  Figure  2 
was  scheduled  to  have  the  Analysis  phase  continue 
until  after  the  completion  of  the  design  activi¬ 
ties.  This  occurs  most  often  when  the  system 
specifications  are  not  fully  developed  at  the 
start  of  the  project.  Functional  requirements  are 
allowed  to  change  during  the  Design  phase  much  more 
than  a  pragmatic  approach  would  dictate.  This  is 
the  case  with  many  information-system  developments 
where  management  participation  in  defining  func¬ 
tional  requirements  is  not  sufficient.  As  details 
of  the  design  become  established,  the  impacts  of 
the  specifications  become  more  apparent  to  members 
of  management,  and  their  reactions  require  changes 
in  the  specifications.  Many  project  managers, 
therefore,  do  not  attempt  to  finalize  the  system 
specifications.  Instead,  they  schedule  the 
Analysis  and  Design  phases  concurrently.  The 
period  of  analysis  after  the  completion  of  the  de¬ 
sign  is  used  to  complete  the  documentation  of  the 
specifications. 


in  explaining  the  cost  of  testing  is  the  effort 
given  to  the  earlier  activity  of  design.*  We  be¬ 
lieve  there  exists — in  effect —  the  following  type 
of  relationship. 


MAN-MONTHS  OF  DESIGN 


This  curve  indicates  that — over  some  range — if 
the  effort  given  to  design  is  reduced ,  there  will 
will  be  added  effort  required  during  the  testing 
(as  well  as  coding)  phases  because  of  errors  and 
difficulties  with  the  program.  Conversely, 
additional  time  spent  in  design  of  the  software 
will  reduce  subsequent  effort  required  for  test¬ 
ing  (and  for  coding). 

Similar  arguments  can  be  made  about  rela¬ 
tionships  between  resources  expended  in  coding  or 
testing  and  the  subsequent  effect  upon  required 
resources  after  the  software  is  Installed.  In 
these  cases,  man-months  of  coding  or  man-months 
of  testing  would  be  the  abscissa  and  man-months 
of  maintenance  would  define  the  ordinate. 


Another  consequence  of  allowing  the  complet¬ 
ion  of  specifications  to  wait  until  the  design  is 
completed  is  that  the  effort  required  to  make  the 
specification  changes  tends  to  be  reported  as  part 
of  the  coding  and  subsequent  activities.  Analysis 
of  project  data  from  this  point  of  view  suggests 
that  the  practice  may  be  quite  common.  Similar 
problems  of  distinguishing  product  and  resource 
consumption  occur  in  other  life  cycle  phases. 

This  complicates  the  analysis  of  phase  inter¬ 
relationships. 

The  preceding  discussion  has  been  presented 
in  support  of  the  contention  that  the  relation¬ 
ships  among  the  software  development  phases  may  be 
extensive  and  very  important  to  the  consumption  of 
resources.  As  will  be  shown  later,  these  relation¬ 
ships  extend  Into  the  Operation  phase. 

JUSTIFICATION  OF  THE  SELECTED  MODEL 
Tradeoffs  Between  Phases 

Considering  the  above  discussion,  we  are  in¬ 
terested  in  proving  a  methematlcal  relationship 
that  in  addition  to  some  measure  of  the  product 
resulting  from  the  expenditure  of  resources  con¬ 
tains  a  number  of  significant  interrelationships 
among  phases  of  the  software  life  cycle.  We  want 
to  show,  for  example,  that  a  significant  element 


These  interrelationships  pose  a  number  of 
distinct  analytical  problems.  First,  the  trade¬ 
off  indicated  above  is  shown  for  a  given  program 
size.  A  simplistic  attempt  to  correlate  actual 
design  and  testing  resources  without  including 
program  size  will  almost  invariably  result  in  a 
positive  correlation.  This  is  because  total  re¬ 
source  requirements  tend  to  increase  with  program 
size  (and/or  complexity)  in  a  typical  data  base. 

A  data  base  is  required  that  is  large  enough  to 
identify  the  variations  in  design  and  testing  for 
a  constant  program  size;  in  effect,  to  determine 
the  relationships  illustrated  below:** 


* 

The  importance  of  adequate  design  effort  is 
stressed  throughout  the  literature.  For  ex¬ 
ample,  Boehm,  et.al.,^  in  an  analysis  of  soft¬ 
ware  errors  found  that  design  errors  outweighed 
coding  errors  64%  to  36%.  Moreover,  design 
etiurs  took  far  longer  to  detect  and  correct. 

Mathematically,  these  relationships  can  be  ex- 

pressed  as  X^  -  a  PX~b  (or  P  -  a  X  bXc)  where 
t  1  a  o  d  t 

X^  -  design  resources,  Xt  -  test  resources,  and 

P  *  program  size  (or  complexity),  i.e.,  a  measure 
of  output. 
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*4  ANALYSIS  AND  DESIGN 

MM  CODING  AND  CHECKOUT 

MM  TESTING 

MM/ MO  MAINTENANCE 

CHANCES 


REPORTED  ERROR  RATE 
-  EQUAL  TO  IDEAL 
>  GREATER  THAN  IDEAL 
<  LESS  THAN  IDEAL 


NO  NO  NO  YES  YES  NO  NO 


Where  P. ,  P2,  P3  are  successively  larger  program 
sizes. 

The  Proposed  Model 

These  relationships  are  comparable  to  those 
discussed  in  basic  economic  tests  on  production 
theory.  However,  the  interrelationships  among  the 
phases  are  actually  more  complex.  Design,  coding, 
testing,  and  maintenance  are  all  interrelated.  A 
limitation  on  design  resources  may  be  passed 
through  all  the  way  to  maintenance.  When  the 
system  is  installed,  the  logic  and  coding  in¬ 
adequacies  stemming  from  insufficient  design  be¬ 
come  apparent  as  added  costs  for  program  revision. 
Graphically,  the  result  is  a  multi-dimension 
tradeoff  surface.  Arithmetically,  the  relation¬ 
ships  are  of  the  following  form: 

P  »  a  XC  Xd  Xk 
deem 

We  believe  that  a  few  relationships  should 
dominate  in  this  multi-dimensional  set  of  pro¬ 
spective  interrelationships.  The  most  important 
is  hypothesized  to  be  between  design  and  the  sub¬ 
sequent  testing  period.  If  the  results  of  the 
testing  period  were  reasonably  uniform  in  terms  of 
remaining  errors,  then  the  hypothesis  could  be 
left  simply  between  the  Design  and  Testing  phases. 
However,  the  number  of  errors  found  during  the 
Operations  phase  varies  widely  among  reported 
project  histories.  Thus,  we  broaden  the  hypothe¬ 
sis  to  state  that  reduced  resources — relative  to 
some  norm — given  to  the  design  phase  will  result 
in  greater  resource  requirements  for  testing  and / 
or  in  higher  error  rates  during  the  Operations 
phase. 

Similarly,  if  insufficient  resources  are 
committed  to  the  coding  phase,  one  can  expect 
(1)  a  requirement  for  more  extensive  resources 
during  Test,  and/or  (2)  higher  error  rates  during 
the  Operations  phase. 

Figure  3  illustrates  some  other  departures 
from  the  ideal  which  may  occur,  and  how  they  may 
be  reflected  in  the  error  rate  of  the  delivered 
software,  which  is  indicative  of  the  reliability 
of  the  software. 


Figure  3.  Postulated  Trade-Offs  Among 

Life-Cycle  Man-Hour  Parameters 


For  example,  the  second  column  indicates 
that,  with  all  other  activities  corresponding  to 
the  ideal  and  with  no  changes,  less  than  ideal 
effort  spend  on  coding  and  checkout  would  be  ex¬ 
pected  to  cause  a  higher  error  rate  of  the  de¬ 
livered  software  than  the  ideal.* 

In  conclusion:  the  development  of  low-risk, 
practical  life-cycle  cost  estimating  relation¬ 
ships  requires  the  consideration  of  the  inter¬ 
actions  among  the  activities  or  phases.  Further¬ 
more,  we  postulate  that  any  analysis  that  does  not 
include  these  interactions  will  not  succeed  in 
reducing  the  scatter  that  makes  existing  software 
cost  estimating  schemes  unsuitable  for  effective 
project  planning  and  control. 

Data  Collection  Problems 
* 

The  interactions  among  life  cycle  phases  de¬ 
scribed  by  the  above  model  pose  some  practical 
problems  in  data  collection  and  interpretation 
that  must  be  solved  before  valid  data  will  be  ob¬ 
tained.  To  firmly  establish  the  determinants  of 
software  costs,  we  believe  that  actual  cost  data 
must  be  recorded  in  keeping  with  a  refined  process 
model  that  describes  the  interactions  that  occur 
during  system  development . *  In  particular,  man¬ 
hours  expended  for,  say,  recoding  during  the 
nominal  testing  phase  should  be  recorded  as  such. 


One  might  argue  that  the  "ideal"  error  rate  would 
be  zero;  but  a  practical  solution  would  be  to 
avoid  spending  large  amounts  of  resources  to 
achieve  zero  errors.  Therefore,  it  would  be  ex¬ 
pected  that  proper  planning  would  allow  for  some 
small  acceptable  error  rate.  Obviously,  this 
tolerance  of  errors  does  not  apply  to  defense 
systems  or  man-rated  systems,  but  it  would  be 
acceptable  for  most  information  systems. 


An  adequate  cost-reporting  system  also  re¬ 
quires  a  corresponding  record  of  the  output  (e.g., 
lints  of  code)  that  is  associated  with  the  cost. 

This  is  one  of  the  major  problems  in  cost-control 
of  software  programs — it  is  comparatively  easy  to 
establish  what  costs  have  been  incurred;  the 
missing  element  is  the  amount  of  progress  that  has 
been  made. 

APPLICATION  TO  AVAILABLE  DATA 

There  is  no  shortage  of  mathematical  rela¬ 
tionships  describing  software  phenomena.  What  is 
in  pitifully  short  supply  is  reliable  data  with 
which  to  prove  them.  Our  selection  of  a  hypo¬ 
thesis  has  increased  the  quantity  and  precision  of 
the  data  required  to  prove  it.  However,  we  be¬ 
lieve  that  no  existing  approach  to  modeling  life 
cycle  costs  has  been  successful.  Therefore,  ex¬ 
tending  the  complexity  of  the  model  and  conse¬ 
quently  the  data  requirement  may  be  justified  if 
they  produce  a  better  prediction  of  resource 
requirements. 

It  is  especially  difficult  to  obtain  data  de¬ 
scribing  the  individual  phases  of  the  software 
life  cyle. 

Life  cycle  phases  are  not  defined  consist¬ 
ently  or  even  accurately  for  most  software 
development  projects.  The  activities  that  com¬ 
prise  analysis  as  opposed  to  design  are  not  clear¬ 
ly  stated  or  understood  by  most  project  managers. 
Furthermore,  most  of  them  don't  care  because  they 
have  more  important  problems  to  deal  with. 

If  a  milestone  concept  is  used  in  the  project 
management,  the  project  goes  from,  say,  design  to 
coding  at  a  specified  point  in  time.  If  there  are 
changes  to  the  design,  the  required  effort  will  be 
reported  as  coding. 

Even  when  projects  are  reported  by  activity, 
it  is  difficult  to  distinguish  consistently  among 
coding,  unit  testing,  and  system  testing.  These 
activities  may  be  conducted  at  several  levels 
simultaneously.  Changes  or  detected  errors  can 
require  reiterations  among  the  activities. 

As  a  preliminary  test  of  the  phase  interre¬ 
lationships  model,  we  attempted  to  use  some  available 
data  to  describe  the  tradeoff  between  Design  and 
Coding  and  Testing.  Coding  and  Testing  were  com¬ 
bined  in  order  to  avoid  the  problem  of  distin¬ 
guishing  between  these  two  phases. 

Data  was  available  from  fourteen  defense 
system  projects.  An  equation  of  the  form: 

Xo  * 


was  fit  to  the  data.  The  result  is  shown  in 
Figure  A. 

The  curves  show  a  definite  tradeoff  between 
design  effort  and  coding  and  testing.  Further¬ 
more,  there  is  a  diminishing  return  in  reducing 
coding  and  test  time  obtained  by  increasing  the 
design  effort  beyond  a  certain  point.  This  point 
increases  markedly  with  increasing  program  size. 
There  is  an  indication  of  a  very  heavy  penalty  for 
failing  to  perform  some  minimum  amount  of  design. 

CONCLUSIONS 

The  phase  interrelationships  hypothesis  is  an 
intuitively  satisfying  model  for  explaining  soft¬ 
ware  life  cycle  resource  requirements.  Looking 
along  this  line  is  Justified  by  the  failure  of 
simplier  approaches. 

A  preliminary  empirical  test  of  the  hypothe¬ 
sis  with  an  extremely  small  data  set  is  positive. 
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Where  I  •  Size  of  delivered  program  in 
1000s  of  object  instructions 

Wfp  ■  Kanmontha  of  system  design 
MMg^.  *  Manmonths  of  coding  and  testing 
a,  b,  c  »  Constants 
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Figure  1.  Scheduled  and  Actual  Activities  in  a  Software  Development: 
Example  1 


Figure  2.  Scheduled  and  Actual  Activities  in  a  Software  Development: 
Example  2 


Figure  4.  Relationships  Between  Design  Effort  and  Coding  and  Testing  Effort 
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ABSTRACT 


Perhaps  the  least  appreciated  area  of  modern 
software  technology  is  top-down  development.  Top- 
down  development  is  far  more  than  a  programming 
technique  suitable  for  application  to  an 
individual  work  assignment.  Top-down  development 
is  a  rich  and  powerful  technique  for  project 
implementation  and  for  system  integration.  The 
characteristics  of  the  top-down  process 
(executable  as  a  system  and  self-integrating) 
suggest  that  the  process  may  be  a  foundation  of 
integration  engineering. 

INTRODUCTION 


This  paper  provides  key  exerpts  from  a  recent  IBM 
technical  report  (FSD  78-0034)  by  the  authors.  In 
addition,  the  workshop  presentation  summarizes  the 
example  system  employed  in  the  full  report. 

A  number  of  trends  in  data  processing  appear  to 
justify  establishing  a  discipline  of  integration 
engineering.  These  trends  include: 

a.  Aggregating  larger  systems 

b.  Requiring  higher  availability 

c.  Distributing  systems 

d.  Developing  systems  concurrently. 

Mills  has  defined  software  as  logical  doctrine  for 
the  harmonious  cooperation  of  people  and  machines 
(1).  When  a  system  is  defined  to  be  a  coherent 
assemblage  of  people  and  hardware  with  specific 
capabilities,  the  system  engineering  and  related 
operating  rules  are,  in  fact,  implemented  in  the 
operational  software.  From  this  perspective,  we 
speculate  that  system  integration  and  integration 
engineering  technology  can  be  approached  from 
software  technology. 

Our  speculation  can  be  illustrated,  and  perhaps 
substantiated,  by  a  single  approach,  top-down 
development.  Top-down  development  is  one  of  the 
least  appreciated  facets  of  software  technology. 
Top-down  development  is  more  than  a  individual 
work  assignment  technique;  it  is  a  profound 
technical  and  managerial  strategy  as  well  (2). 


The  key  characteristics  of  top-down  development 
suggest  that: 

a.  The  evolving  software  is  always 
executable  as  a  system 

b.  The  process  is  sel f- integra t ing . 

An  adaptation  of  the  top-down  approach  was 
conceived  by  O'Neill  in  1972  and  demonstrated  in 
the  overlapped  development  and  integration  ol  the 
TRIDENT  Command  and  Control  System  (3).  The 
concept  is  that  software,  unlike  hardware,  can  be 
implemented  in  a  system  sense  to  be  always 
executable  and  that  system  integration  can  proceed 
from  the  software  (i.e.,  the  computer)  to  the  non¬ 
programmable  hardware. 

In  their  previous  report  H),  which  introduced  the 
term  "integration  engineering",  the  authors 
generalized  the  top-down  adaptation  by  defining  a 
top  for  any  system  and  by  relating  the  system  top 
to  an  integration  strategy.  The  top  of  any  system 
is  the  logic  for  transitioning  the  system  from 
state  to  state.  This  definition  is  general  since 
even  the  simplist  system  must  be  turned  on  and 
off.  In  more  complex  systems  there  are  many 
states  including: 

a .  Initialization 

b.  Maintenance 

c.  Development 

d.  Training 

e .  Reduced  function 

f.  Full  function 

g.  Termination. 

A  key  integration  engineering  hypothesis  is  that 
transitioning  logic  is  a  candidate  integration 
strategy. 

The  need  for  integration  engineering  technology  is 
greater  than  the  need  for  systems,  hardware,  or 
software  engineering  technology.  Far  more  tools 
exist  to  support  design  and  development.  In  fact, 
i mp 1 emen ta t ional  technology  has  progressed  beyond 
our  ability  to  effectively  employ  it.  The 
continuing  improvements  in  data  processing 
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hardware  have  led  to  the  introduction  of 
programmable  hardware  into  many  traditional  non¬ 
programmable  subsystems.  While  the  resulting 
systems  may  be  thought  of  as  distributed,  they 
often  result  from  subsystem  rather  than  system 
decisions . 


OBJECTIVES 


The  objectives  of  integration  engineering  will  be 
realized  by  an  approach  that  places  state 
transitioning  at  the  top  of  the  system.  Component 
boundaries  for  integration  are  not  drawn  between 
hardware  and  software.  The  boundaries  are  drawn 
around  analyses  that  define  the  system  states  and 
the  state  transitioning.  Maintenance,  training, 
and  reduced  function  states  thereby  become  an 
integral  part  of  the  design  and  development 
process,  and  appear  as  early  milestones  on  the 
integration  schedule. 

Early  deliveries  are  aimed  at  integrating  the 
lowest  operating  level  (i.e.,  the  maintenance 
state):  the  configuration  with  the  minimum 
available  hardware.  Each  successive  delivery  (and 
integration)  can  be  viewed  as  a  transition  from 
one  operating  state  to  the  next  higher  one.  After 
integration  of  the  final  deliveries  (full  up 
state),  the  preceding  deliveries  are  retained 
rather  than  discarded. 

Delivery  of  a  maintenance  state  allows  for 
integration  of  hardware/software  from  a  component 
level  through  the  subsystem  to  the  system  level 
with  continuous  availability  of  system  test 
elements.  This  allows  for  verification  of 
performance  at  each  level  of  integration  without 
dependence  on  higher  levels  performing  correctly. 

APPROACH 


Some  underlying  concepts  of  integration 
engineering  are  presented  in  the  following 
pa  ragraphs . 

Top-Down  Implementation 

The  top-down  approach  is  patterned  after  the 
natural  approach  to  system  design  and  requires 
that  programming  proceed  from  developing  the 
control  architecture  (interface)  statements  and 
data  definitions  downward  to  developing  and 
integrating  the  function  units.  Top-down 
programming  is  an  ordering  of  system  development 
which  allows  for  continual  integration  of  the 
parts  as  they  are  developed  and  provides  for 
interfaces  prior  to  the  parts  being  developed. 

Operating  Modes 

Operating  states  or  modes  describe  the  various 
configurations  of  system  resources  (people, 
programs  and  equipment)  which  operate  the  system. 


Preoccupation  with  the  full  operational  function 
state  can  lead  to  an  unsatisfactory  system  since 
the  early  work  must  also  specify  the  procedures 
required  to  transition  the  system  from  state  to 
state.  The  requirements  determination  process 
which  precedes  system  development  must  specify  the 
functional  availability  (e.g.,  tolerable  outage) 
requirements  to  guide  the  eventual  development  of 
procedure  configurations  for  each  state. 

Each  state  has  instrinsic  procedures  and  exists  in 
a  broader  procedural  environment  which  enables 
transition  from  state  to  state.  A  system  top  is 
the  transitioning  procedure  and  this  perspective 
helps  drive  the  entire  process  (function, 
development,  test,  operation). 

Prototypes 

Prototype  systems  are  developed  for  concept 
validation,  feasibility  determination,  or 
benchmark  calibration  purposes.  The  sussessful 
prototype  generally  represents  an  early  step  in  a 
major  system  acquisition.  While  the  successful 
prototype  overcomes  the  certainty  of  specific 
failures,  the  typical  prototype  does  not  guarantee 
full  scale  success.  Serious  problems  or 
disappointing  performance  may  arise  in  scaling  up 
to  an  operational  system.  Consideration  must  be 
given  to  scaling  problems  throughout  the  prototype 
planning. 

One  approach  to  prototyping  is  to  scale  down  from 
the  system  design  to  the  prototype  design.  In  an 
evolutionary  development  approach,  the  prototype 
is  a  subset  of  the  intended  system.  The  prototype 
may ,  in  fact,  be  an  operational  state  of  the 
intended  system. 

Integration 

Since  software,  unlike  hardware,  can  be 
implemented  in  a  system  sense  to  be  always 
executable,  the  system  integration  should  proceed 
from  software  (i.e.,  the  computer)  to  the 
nonprogrammable  hardware.  Phased  deliveries  of 
software  to  software/hardware  integration  ensure 
intermediate  evaluations  where  managers  can 
determine  progress  and  acceptabi 1 i ty  of 
concurrently  developed  components.  Phased 
deliveries  provide  the  means  of  integrating 
hardware  and  software  incrementally  rather  than  as 
one  enormous  task  late  in  the  project.  The 
problems  are  found  and  corrected  earlier  while 
more  schedule  is  available  for  retesting. 

The  test  system  should  also  be  built  incrementally 
by  phasing  its  development  to  match  the  system 
test  and  integiation  phasing. 

Earlier,  operating  states  were  described  arid  the 
state  transitioning  procedures  were  considered  to 
be  a  top  of  any  system.  A  process  is  postulated 
to  exploit  the  transitioning  procedures  as  a 
strategy  for  implementing  and  integrating  the 
larger  than  software  system.  The  process 
considerat ions  are: 
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a.  The  number  of  states  and  transitions 
required  for  integrating  the  system  may 
exceed  the  number  required  for  operating 
the  system. 

b.  The  testing  of  reduced  function  states  is 
progressive  and  is  an  integration  process 
by-product  rather  than  an  afterthought  to 
full  function  state  testing.  The  more 
critical  functions  (i.e.,  those  functions 
occurring  in  more  states)  receive  more 
tes t i ng. 

c.  Where  development  funding  is  spread  over 
many  years  (or  under  design-to-cost 
limits)  the  evolving  system  can  be 
executed  as  a  reduced  system  for  testing 
or  limited  operation. 

Incremental  Development 

The  top-down  approach  is  applied  in  discrete 
stages  (initial,  intermediate,  final)  to  improve 
manageability.  Each  stage  is  carried  to  readiness 
for  system  integration. 

The  initial  stage  begins  early  in  the  schedule, 
after  the  design  specifications  are  developed,  to 
exercise  and  validate  the  interface  between 
executive  and  functional  software  and  to  obtain  an 
early  validation  of  computer  utilization 
allocations . 

The  intermediate  stages  are  developed  to  exercise 
and  vaildate  the  functional  software  interfaces 
and  selected  critical  system  functions. 

Simulation  software  provides  inputs  to  the 
functional  software  in  lieu  of  actual  hardware 
inputs.  Intermediate  stage  software  is  used  for 
interface  validation  and  early  system  hardware 
integration. 

The  scheduling  of  initial  and  intermediate  stages 
must  provide  time  for  appraisal  and  redirection  of 
development.  The  final  stage  is  developed  to 
provide  all  functions  the  software  is  required  to 
perform. 

Software  integration  and  verification  is  a 
controlled  process  by  which  intermediate  and  final 
deliveries  are  integrated  in  simulation 
environments  which  at  successive  tests  more  fully 
approximate  actual  use.  The  objectives  are  to 
verify  that  the  integrated  software  will  perform 
its  specified  functions  in  the  simulated 
environment  and  to  reduce  the  number  of  problems 
encountered  during  system  test  and  evaluation. 

System  Readiness  Functions 

In  complex  systems,  the  various  assessment  and 
hardware  maintenance  support  functions  must  be 
integrated  into  the  operational  system.  These 
functions  include: 


a.  Interface  tests  to  assess  hardware 
integration  during  operation  and  testing 

b.  Operability  tests  to  assess  functional 
avai lability 

c.  Equipment  tests  to  assess  availability 
and  to  support  corrective  or  preventive 
maintenance 

d.  Component  status  determi nat ion  to  support 
reconfiguration 

e.  Data  extraction  for  presentation  or 
subsequent  analysis. 

In  addition,  the  personnel  readiness  functions 
(training  and  exercise)  can  be  integrated  into  the 
operational  system. 

Appli cab i 1 i ty  to  Subs ys terns 

The  analytical  process  by  which  a  system  is 
partitioned  into  states  can  be  applied  to 
subsystems  and  lower  level  subdivisions.  A  result 
of  the  process  is  the  transitioning  procedure 
which  employs  system  parameters  in  its  conditional 
logic.  While  we  may  consider  any  conditional 
expression  as  choosing  among  states,  at  some  point 
we  cease  to  distinguish  states  unless  a 
software/hardware/people  reconfiguration  or  major 
component  state  change  is  involved.  At  the  system 
level  we  may  wish  to  consider  intra-subsystem 
states  as  sub-states.  (Sub-states  have  the 
property  of  being  nested  within  states). 

Now  if  we  consider  testing  and  more  particularly 
integration  to  be  conditional  in  nature,  we  may 
consider  test  and  integration  as  transitioning 
procedures.  Similarly,  incremental  development 
may  be  considered  in  a  transitional  context. 


TRANSITION  MANAGEMENT 


A  key  hypothesis  of  integration  engineering  is 
that  the  state  and  transition  management  is  a 
natural  part  (i.e.,  the  top)  of  the  system  and 
should  not  require  schedule  extensions  for 
testing.  Transition  management  is  the 
implementation  of  the  integration  engineering 
approach  stated  under  APPROACH.  Some  work  remains 
to  be  done  in  order  to  generalize  transition 
implementation.  The  issues  include  the  degree  to 
which: 

a.  Transition  management  is  dependent  on  the 
operating  system 

b.  The  transition  management  function  is 
distributed  in  a  distributed  system 

c.  The  actual  configuration/state  is 
transparent  to  the  application  functions 
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d.  System  readiness  functions  are 
implemented . 

Transition  management ,  like  integration 
engineering,  exposes  most  system  design  issues 
from  the  basic  system  architecture  to  component 
reliability  requirements.  However,  every  system 
is  in  some  way  unique  in  the  parameters  that 
define  transition  management. 

Ope rating  Sys tem 

Transition  management  can  be  viewed  as  part  of,  a 
superset  of,  or  a  subset  of  the  operating  system. 

In  practice  transition  management  is  all  three. 

If,  however,  the  operating  system  interface  to  the 
functions  required  by  transition  management  (e.g., 
system  loader,  fault  recovery,  interrupt 
processing)  is  well  defined  and  documented,  then 
the  degree  of  dependence  would  be  minimized. 

Distributed  Management 

The  transition  management  function  is  completely 
dependent  on  the  system  architecture.  For 
example,  in  a  distributed  system  the  transition 
management  function  itself  could  be  distributed. 
The  issues  raised  here  are  the  same  as  those 
raised  in  connection  with  a  distributed  data 
management  system  (e.g.,  master/slave  versus  peer 
to  peer,  dead-lock  avoidance,  functional 
integrity).  The  resolution  of  these  issues  leads 
directly  to  questions  about  global  status  data, 
information  hiding,  and  the  connectivity  between 
nodes . 

Application  Functions 

Obviously,  the  application  functions  must  be 
designed  to  be  independent  of  the  system 
configuration.  This  is,  however,  achievable  in 
any  system  and  made  easier  by  functional  bus 
addressing  (4). 

System  Readiness 

The  system  readiness  functions  are  the  single  most 
important  element  to  transition  management,  and 
one  of  the  critical  tools  for  integration 
engineering.  If  the  system  has  the  flexibility 
and  redundancy  to  totally  (or  partially)  recover 
from  a  single  failure,  but  cannot  detect  the 
failure  when  it  occurs,  then  the  system 
flexibility  is  wasted. 

The  system  readiness  functions  must  provide  fully 
integrated,  online  system  tests  and  equipment 
diagnostics  that  operate  in  realtime.  The 
function  should  continually  assess  system 
readiness  to  meet  the  defined  mission  while 
causing  minimum  degradation  to  the  operational 
system.  The  readiness  functions  should  include 
the  following  tests: 

a.  Operability  tests  assess  the  capability 
of  the  entire  system  to  perform  as 
required.  The  test  should  address  the 
system  as  a  whole  and  verify  that  the 


various  complete  functions  can  be 
performed. 

b.  Interface  tests  assess  the  capability  of 
all  hardware  units  to  intercommunicate 
properly  and  to  perform  their  electrical 
functions  correctly.  This  test  should 
address  each  individual  hardware  unit  and 
all  of  its  signal  interfaces.  Delected 
hardware  unit  failures  should  be 
immediately  displayed. 

c.  Alignment  tests  assist  in  the  assessment 
of  sensor  and/or  system  alignment 
parameters . 

d.  Performance  monitoring  tests  check  each 
equipment  unit  and  interface  for  proper 
operation. 

e.  Fault  location  tests  exercise  the  unit  to 
make  a  detailed  determination  of  the 
exact  nature  and  location  of  the  failure 
for  repair.  They  require  that  the 
subject  equipment  be  dedicated  to  testing 
and  repair. 

These  tests  are  required  to  operate  ’’online”  in 
that  they  will  run  concurrently  with  the 
operational  program  without  any  system 
degradation,  as  opposed  to  an  "offline"  test  which 
is  conducted  independently  of  the  operational 
program  and  prohibits  that  program  from  meeting 
its  designated  operational  requirements. 
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ABSTRACT 

In  the  development  of  large  scale  computer  soft¬ 
ware  and  in  the  management  of  the  development  pro¬ 
cess  it  is  often  useful  to  model  the  reliability 
and/or  the  cost  of  development  of  these  software 
packages.  The  literature  has  many  references  that 
assume  a  model  and  show  its  usefulness  as  a  manage¬ 
ment  tool.  The  reader  is  referred  to  references 
[29  through  41].  Several  of  these  publications  use 
Bayesian  methodology.  There  are  divided  opinions 
as  to  the  application  of  some  of  these  models,  e.g. 
some  authors  feel  the  exponential  distribution  found 
so  useful  in  the  reliability  field  for  hardware 
should  not  be  applied  to  software.  Others  vehement¬ 
ly  disagree.  In  fact  some  even  show  data  that  seem 
to  substantiate  their  belief.  The  papers  relating 
to  Bayesian  methodology  also  assume  a  prior  distri¬ 
bution  [34,35,36,37].  These  assumptions  in  turn 
can  also  be  challenged.  In  order  to  get  an  idea  of 
what  type  of  a  probability  distribution  might  be 
apDlicable  and  what  mathematical  form  might  be  ap¬ 
propriate  we  offer  here  a  structured  approach  in 
assessing  the  probability  distribution  subjectively. 
It  is  possible  to  base  the  analysis  on  either  (l)the 
subjective  nrior  distribution  when  no  test  data  are 
available,  or  on  (2)  a  posterior  distribution  which 
with  the  use  of  Bayes'  Theorem,  combines  the  prior 
distribution  with  the  likelihood  function  (the  sam- 
pl ing  evidence) . 

It  is  probably  true  that  the  engineering  prac¬ 
titioners  by  and  large  are  not  familar  with  Bayesian 
statistical  concepts.  Of  course,  there  are  excep¬ 
tions.  This  paper  offers  a  methodology  of  assessing 
a  prior  distribution  subjectively.  Once  this  has 
been  done  the  general  shape  of  the  distribution  can 
be  ascertained, then  the  search  for  the  mathematical 
form  is  greatly  simplified.  For  instance,  the  pro¬ 
bability  distribution  may  be  skewed,  not  exist  for 
negative  values  of  the  random  variable.  This  would 
eliminate  a  whole  series  of  probability  models  like 
e.g.,  the  normal  distribution  and  give  rise  to  a 
host  of  others.  While  it  is  still  possible  to  se¬ 
lect  a  model  from  many  available  basic  models,  the 
selection  process  is  at  least  based  upon  some  evi¬ 
dence,  namely  the  opinion  of  the  experts  in  charge 
of  developing  the  software  package.  Two  computer 
programs  were  written. 

a.  Assessing  a  subjective  prior  distribution  by 
elicitating  answers  to  questions  on  a  CRT 
(Cathode  Ray  Tube).  The  answers  to  these 
questions  are  used  to  plot  the  distribution 
function  as  well  as  the  density  function. 


b.  From  the  general  shape  of  these  functions  a 
family  of  probability  functions  are  sugges¬ 
ted.  For  instance  an  inverted  gamma  distri¬ 
bution,  a  beta  distribution,  or  say  a  log 
normal  distribution  might  be  hypothesized. 
Some  of  the  summary  output  of  the  first 
program  become  inputs  for  finding  the  para¬ 
meters  of  the  assumed  distributions 

An  example.of  a  log  normal  distribution  is  used 
but  other  families  of  distributions  could  have  been 
selected  as  well . 

This  paper  does  not  explicitly  deal  with  the 
derivation  of  the  posterior  distribution  which  is 
found  via  Bayes'  Theorem  in  conjunction  with  in¬ 
coming  data.  The  prior  distribution,  however,  is 
an  essential  part  of  finding  the  posterior  dis¬ 
tribution.  If  the  prior  distribution  found  is  in¬ 
tegrated  with  test  information  as  data  become 
available,  then  obviously  this  is  more  complete 
information  than  just  test  information  alone. 

INTRODUCTION 

The  importance  of  consistent  prior  distribu¬ 
tions  is  two-fold.  First,  these  distributions  re¬ 
flect  consistent  initial  predictions  because  they 
are  developed  by  a  structured  process.  Second, 
these  distributions  are  the  starting  point  for 
applying  Bayes'  Theorem  to  develop  the  posterior 
distribution  by  modifying  the  prior  distribution 
with  actual  data  available  later. 

This  paper  will  show  how  a  prior  distribution 
can  be  found  subjectively,  even  though  no  collater¬ 
al  data  are  available.  Then,  once  this  has  been 
achieved,  a  family  of  known  probability  functions 
is  used  to  ascertain  if  the  found  prior  distribu¬ 
tion  belongs  to  the  given  family.  This  paper  does 
not  address  the  subject  of  determining  the  poster¬ 
ior  distribution. 

Two  interactive  computer  programs  were  writ¬ 
ten. 

a.  Subjective  assessment  of  fractiles. 

b.  Using  some  of  the  frac tiles  found  in  (a), 
the  parameters  of  the  log  normal  distribu¬ 
tion  are  found.  A  log  normal  distribution 
has  been  selected  as  an  example  but  any 
other  pertinent  distribution  could  have 
been  used. 
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PREVIOUS  WORK 

More  recently,  decision  theory  has  been  considered  as  a  general 
framework  for  logical  analysis  of  a  decision  problem  under 
uncertainty.  As  such,  considerable  attention  has  been  given  to 
problem  formulation  and  methods  for  the  assessment  of  a  prior 
distribution.  For  example,  Schlaifer’s  recent  book  (221  is  largely 
devoted  to  the  formulation  and  prior  analysis  of  decision  problems; 
posterior  analysis  is  discussed  only  in  the  last  part  of  the  text.  Howard 
and  his  associates  (see,  for  example,  III)  and  1 25) )  have  emphasized 
the  application  of  decision  theory  to  complex,  dynamic,  and  uncertain 
decision  problems.  In  dealing  with  these  problems,  they  have 
explicitly  included  the  problem  formulation  phase  in  the  decision 
analysis  cycle. 

Decision  theory,  either  concerned  with  specific  models  or  general 
frameworks,  treats  uncertainty  through  subjective  probability  and 
treats  attitude  toward  risk  through  utility  theory.  Regardless  of 
whether  the  decision  maker  is  concerned  with  prior  or  posterior 
analysis,  the  prior  probability  distribution,  reflecting  his  quantified 
judgments  about  uncertainty,  is  an  indispensable  input  to  the  analysis. 

One  difficulty  associated  with  probability  assessment  is  the 
assessor’s  inconsistencies  which  often  occur  in  formulating  a  prior 
distribution.  The  question  of  how  to  discover  and  remove 
inconsistencies  is  of  general  interest  to  decision  analysts.  Another 
question  of  interest  is  how  to  fit  a  probability  distribution  using  the 
assessed  fractile  in  order  to  make  the  subsequent  analysis  more 
tractable.  Both  of  these  questions  are  addressed  in  this  paper  The 
paper  offers  two  computer  programs.  The  first  program  allows  a 
person  to  interact  with  the  computer  via  a  graphical  device  (Cathode 
Ray  Tube  (CRT|)  during  his  course  of  establishing  a  subjective 
distribution.  The  second  program  fits  a  lognormal  distribution  to  the 
subjective  distribution. 

During  recent  years,  subjective  probability  has  been  studied  by 
researchers  in  various  disciplines  such  as  psychology,  mathematics, 
statistics,  engineering,  and  business  administration  (as  evidenced  by 
the  references  at  the  end  of  the  paper).  While  some  ol  these  studies  are 
mainly  theoretical  or  philosophical,  others  are  experimental. 

In  their  text  (17),  Pratt.  Raiffa.  and  Schlaifer  present  the 
method  of  equally  likely  subintervals.  Subsequently.  Raiffa  (18] 
illustrates  this  method  in  detail  by  providing  a  dialogue  between  a 
decision  analyst  and  his  client.  Schlaifer  f  2  2 1  advocates  this  method 
and  offers  a  computer  program  for  fitting  a  cumulative  function 
through  assessed  frac tiles. 

For  his  experimental  study,  Winkler  (26)  developed  a 
questionnaire  using  four  assessment  techniques;  (a)  Cumulative 
Distribution  Function  -  assessment  of  fractites  by  means  of  equally 
likely  subintervals  or  direct  questions  regarding  fractiles, 
(b)  Hypothetical  Future  Samples,  (c)  Equivalent  Prior  Sample 
Information,  and  (d)  Probability  Density  Function.  He  used  this 
questionnaire  to  elicit  prior  distributions  from  38  selected  subjects 
involved  in  his  study. 

The  use  of  penalty  functions,  or  scoring  methods,  has  been 
disarmed  by  several  researchers  as  means  of  encouraging  honest 
assessments.  Specifically ,  de  Finetti  (3|  presents  the  quadratic  scoring 
ride.  Savage  (21)  derives  the  general  class  of  strictly  proper  scoring 
rules  by  considering  probabilities  as  special  cases  of  rates  of 
substitutions.  Winkler  discusses  the  use  of  scoring  rules  and  other 
payoff  schemes  (27)  and  reports  his  experimental  results  ( 28 ) . 

Steel  von  Holstein  and  his  asweiates  <(24]  and  (251 )  focus  on 
the  subject  of  eliciting  the  opinions  of  experts  in  practical  situations 


rather  than  laboratory  experiments.  They  discuss  probability  encoding 
in  the  context  of  decision  analysis  and  propose  the  use  of  a 
probability  wheel  to  facilitate  the  encoding  process. 

At  the  Reliability  Conference  in  1970,  Lin  and  Schick  (13) 
presented  the  use  of  an  on-line  computer  system  to  assist  a  person  in 
developing  a  prior  distribution  to  represent  his  beliefs.  Although  the 
console-aided  procedure  is  illustrated  by  a  problem  in  the  reliability 
field,  this  procedure  is  applicable  to  assessment  of  any  prior 
distribution.  Since  then,  considerable  experience  with  this  procedure 
has  been  gained  from  experiments  involving  students  in  several 
statistics  and  decision  theory  classes  at  the  University  of  Southern 
California. 

The  present  paper  results  from  the  authors’  continued  effort  in 
making  the  probability  assessment  more  practical  by  using  modem 
electronic  computers.  This  paper  offers  a  newly  designed  computer 
program  which  has  incorporated  the  experience  gained  from  the  use  of 
the  previous  program.  To  simplify  the  assessment  procedure,  the  new 
program:  (a)  reduces  the  number  of  questions  significant/y  (from  12 
to  6),  (b)  is  highly  conversational  and  interactive,  (c)  checks  for 
consistency  as  the  user  answers  question  by  question,  (d)  uses 
graphical  display  rather  than  the  typewriter  terminal  to  help  the  user 
visualize  the  assessment  process  as  well  as  to  greatly  increase  the  speed 
of  drawing  the  assessed  probability  curves,  and  (e)  plots  not  only  the 
cumulative  function  but  also  the  density  function.  Once  a  subjective 
distribution  has  been  determined,  a  second  computer  program  will  fit 
a  lognormal  distribution  to  the  subjective  distribution  to  make  the 
subsequent  analysis  of  debugging  problems  more  tractable 
mathematically. 

METHOD  OF  ASSESSMENT 

Several  methods  have  been  suggested  for  estimating  prior 
distributions  (sec.  for  example.  |9).  |17),  and  1 2t» |  >  Our  computer 
program  makes  use  of  the  method  of  equally  likely  subintervals,  which 
perhaps  is  the  most  commonly  used  approach.  The  basic  idea  of  this 
method  is  to  ask  the  decision  maker,  at  any  stage,  to  divide  a  given 
interval  into  two  judgmentally  equally  likely  subintervals. 

To  begin  with,  the  interval  covering  all  possible  values  of  an 
uncertain  quantity  (usually  called  a  random  variable)  is  split  into  two 
subintervals  and  the  decision  maker  is  asked  to  choose  which 
subinterval  to  bet  on.  The  dividing  point  is  then  changed  until  he  feels 
indifferent  between  betting  on  one  or  the  other  subintcrv?'  When  the 
indifference  point  is  reached,  the  decision  maker  feels  that  it  is  equally 
likely  that  the  actual  value  of  the  uncertain  quantity  will  fall  above  (to 
the  right  of)  or  below  (to  the  left  of)  this  point.  The  indifference 
point,  which  divides  the  entire  interval  into  two  subintervals  with 
equal  probabilities,  is  the  median.  Next,  the  decision  maker  is  asked  to 
specify  a  point  which  will  further  divide  the  subinterval  to  the  left  of 
the  median  into  two  equally  likely  parts.  This  new  point  is  the  firet 
quart ile.  Similarly,  the  subinterval  to  the  right  of  the  median  may  be 
further  divided  into  two  equally  likely  parts.  The  decision  maker  may 
proceed  in  this  manner  to  divide  any  given  interval  (generated 
previously)  into  two  equally  likely  subintervals. 

Suppose  we  let  xk  designate  the  k**1  fractile  of  the  uncertain 
quantity  x.  i.c  . 

f»(x<xk)-k  .  0<  k  <  I 

Then,  using  the  method  of  equally  likely  subintervals,  Che  decision 
maker  is  asked  to  respond  to  a  series  of  questions  which  will  lead  to  a 
determination  of  xk  values  for  such  k  as  O  S,  0.2$,  0.75,  etc. 
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COMPUTER  PROGRAM 

The  program  stores  a  set  of  questions  tor  the  method  of  equally 
hlu  subintervals  Thi  questions  are  displayed  successively  on  a 
CRT,  the  user  responds  to  the  questions  by  typing  his  answers  on  a 
teletype  The  response  to  each  of  the  questions  is  processed 
immediately  and  checked  for  logical  consistency. 

Assuming  you  an*  the  user  of  the  program,  the  first  question  calls 
for  the  lower  limit  of  the  probability  distribution  by  asking  you  to 

‘‘Specify  the  largest  value  such  that  you  feel  virtually 
certain  that  the  actual  value  of  the  uncertain  quantity  will 
fall  above  this  vjlue  '* 

The  second  question,  on  the  other  hand,  calls  for  the  upper  limit  of 
the  distribution  by  asking  you  to 

“Specify  the  smallest  value  such  that  you  feel  virtually 
certain  that  the  actual  value  of  the  uncertain  quantity  will 
fall  below  this  value  " 

In  terms  of  the  fractile  notation  described  earlier,  the  first  question 
asks  lor  xq  and  the  second  question  asks  for  xj.  The  program  will 
check  to  see  if  xq  is  less  than  X|  and  if  you  feel  virtually  certain  that 
the  actual  value  of  the  uncertain  quantity  will  lie  in  between  xq  and 

M 

The  third  question  asks  you  to  divide  the  interval  defined  by  the  limits 
Xq  and  x  |  into  two  equally  likely  subintervals  The  question  says: 

“Specify  the  value  such  that  you  feel  it  is  equally  likely  that 
the  actual  value  of  the  uncertain  quantity  will  fall  above  or 
below  tins  value  " 

The  answer  to  this  question  yields  xq  5.  which  should  lie  in  between 
Xq  and  x  | 

The  fourth  question,  which  calls  for  xq  •*<;.  is: 

“Suppose  you  were  told  that  actual  value  is  less  than  xq  $. 
Specify  the  value  such  that  it  is  equally  likely  that  the 
actual  value  of  the  uncertain  quantity  is  either  above  or 
below  this  value  " 

The  program  will  check  to  see  if  this  answer  lies  in  between  xq  and 
x0  5 

The  fifth  question,  which  calls  for  xq  75.  is 

“Suppose  you  were  told  the  actual  value  is  greater  than 
xq  5.  Specify  the  value  such  that  it  is  equally  likely  that  the 
actual  value  of  the  uncertain  quantity  is  either  above  or 
below  this  value  “ 

This  answer  is  checked  to  see  if  it  lies  in  between  xq  $  and  x  j 

At  this  point,  the  program  further  checks  for  consistency 
Specifically,  it  asks 

“Now.  do  you  feel  it  is  equally  likely  that  the  actual  value 
of  the  uncertain  quantity  will  lie  within  the  interval 
between  xq  >5  and  Xq  75  or  outside  of  this  interral*” 


The  assessments  thus  obtained  are  summarized  on  the  CRT  The 
program  then  fits  a  smooth  cumulative  distribution  function  through 
the  assessed  frac tiles.  At  your  request,  it  will  plot  the  cumulative  curve 
and  the  corresponding  density  curve.  If  these  graphs  do  not  seem  to 
reflect  your  judgments  about  the  uncertain  quantity,  you  will  be 
guided  by  the  program  to  revise  your  previous  responses.  Whenever 
you  are  satisfied  with  the  assessed  distribution,  the  mean  and  the 
standard  deviation  are  computed.  In  addition,  you  may  ask  for  0  005. 
0.015.  0.035. ....  0.^5  fractiles  of  the  distribution. 

COMPUTER  OUTPUT 

To  illustrate  the  computerized  method  of  probability  assessment 
discussed  above,  the  computer  output  of  an  example  is  presented  In 
this  example,  the  expert  is  asked  to  quantify  lus  judgments  concerning 
the  debugging  hours  for  a  particular  job  As  we  can  see  from  this 
output,  the  expert  violates  some  of  the  probability  axioms  and  is 
asked  to  revise  hts  responses  several  times. 

THIS  PROGRAM  IS  DESIGNED  TO  ASSIST  YOU  IN  (Al  QUANTIFYING  YOUR 
PROBABILITY  JUDGMENTS  CONCERNING  AN  UNCERTAIN  QUANTITY. 
IB)  CALCULATING  THE  MEAN  AND  STANDARD  DEVIATION  OF  THE  PROBABILITY 
DISTRIBUTION  OBTAINED  FROM  THIS  QUANTIFICATION.  AND  (C)  FITTING  THE 
ASSESSED  DISTRIBUTION  TO  A  THEORETICAL  DISTRIBUTION 

WHAT  IS  THE  UNCERTAIN  QUANTITY  OF  YOUR  CONCERN  NOW’ 

NUMBER  OF  DE  HUGGING  HOURS 

PLEASE  RESPOND  TO  THE  FOLLOWING  QUESTIONS  WITH  YOUR  CAREFUL 
JUDGMENTS 

ID  SPECIFY  THE  LARGEST  VALUE  SUCH  THAT  YOU  FEEL  VIRTUALLY 
CERTAIN  THAT  THE  ACTUAL  VALUE  OF  THE  UNCERTAIN  QUANTITY 
WILL  FALL  ABOVE  THIS  VALUE 

650 

12)  SPECIFY  THE  SMALLEST  VALUE  SUCH  THAT  YOL>  FEEL  VIRTUALLY 
CERTAIN  THAT  THE  ACTUAL  VALUE  OF  THE  UNCERTAIN  QUANTITY 
WILL  FALL  BELOW  THIS  VALUE. 

100 

SORRY.  YOU  HAVE  MISINTERPRETED  THESE  TWO  QUESTIONS  QUESTION  111 
ASKS  FOR  THE  LOWER  LIMIT  OF  THE  UNCERTAIN  QUANTITY  PLEASE  GIVE  YOUR 
NEW  ANSWER  TO  111 

too 

QUESTION  (21  ASKS  FOR  THE  UPPER  LIMIT  OF  THE  UNCERTAIN  QUANTITY 
PLEASE  GIVE  YOUR  NEW  ANSWER  TO  121. 

950 

YOUR  ANSWERS  TO  THESE  TWO  QUESTIONS  IMPLY  THAT  YOU  FEEL  VIRTUALLY 
CERTAIN  THAT  THE  ACTUAL  VALUE  OF  THE  UNCERTAIN  QUANTITY  WILL  LIE  IN 
BETWEEN  100  AND  650  DO  YOU  AGREE’ 

NO 

THEN  YOU  MUST  REVISE  YOUR  ANSWERIS1 
PLEASE  TYPE  YOUR  NEW  ANSWER  TO  <11 
50 

PLEASE  TYPE  YOUR  NEW  ANSWER  TO  12) 

990 

13)  SPECIFY  THE  VALUE  SUCH  THAT  YOU  FEEL  IT  IS  EQUALLY  LIKELY 
THAT  THE  ACTUAL  VALUE  OF  THE  UNCERTAIN  QUANTITY  WILL  FALL 
ABOVE  OR  BELOW  THIS  VALUE 

no 

14)  SUPPOSE  YOU  WERE  TOLO  THAT  THE  ACTUAL  VALUE  IS  LESS  THAN  2B0 
SPECIFY  THE  VALUE  SUCH  THAT  IT  IS  EQUALLY  LIKELY  THAT  THE 
ACTUAL  VALUE  OF  THE  UNCERTAIN  OUANTlTY  IS  EITHER  ABOVE  OR 
BELOW  THIS  VALUE 

J  to 


If  the  check  n  not  met.  the  program  wilt  direct  you  to  review  and 
revise  each  uf  your  previous  answers  Otherwise,  the  program  will 
proceed  to  ask  you  to  specify  the  most  likely  value  (the  mode) 


USER  S  RESPONSE  APPEARS  IN  ITALICS 
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I  MLICVf  YOU  WON'T  ANSWER  THIS  OUBTKM  CAREFULLY.  PLEASE  GIVE  IT 
•OMC  THOUGHT. 

00  YOU  NOW  REALIZE  WHAT  YOU  610  WRONG? 

NO 

YOON  ANSWER  TO  (4>  SHOULD  HAVE  BEEN  IN  BETWEEN  SO  AND  280.  PLEASE 
TYPE  YOUR  NEW  ANSWER. 

IX 

(»)  SUPPOSE  YOU  WERE  TOLD  THAT  THE  ACTUAL  VALUE  IS  GREATER  THAN 
280.  SPECIPY  THE  VALUE  SUCH  THAT  IT  IS  EQUALLY  LIKELY  THAT  THE 
ACTUAL  VALUE  OF  THE  UNCERTAIN  QUANTITY  IS  EITHER  ABOVE  OR 
BELOW  THIS  VALUE. 


Figure  1.  Dutnuution  and  Denwty  Functions  tor  De  Bugging  Hours 
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AN  APPLICATION  FROM  PROGRAM  VERIFICATION 

From  the  assessment  procedure  given  eariier,  several  fractile 
points,  the  mean,  and  the  standard  deviation  are  available  in  the 
summary  output  of  the  computer  program.  Any  two  fractile  points,  or 
a  fractile  point  and  the  mean,  or  a  fractile  point  and  the  standard 
deviation  etc.,  can  be  used  to  determine  the  parameters  of  the 
lognormal  distnbution.  A  new  program  was  developed  that  allows 
some  20  different  input  combination  pairs  in  the  procedure  for 
determining  the  parameters  of  the  lognormal  distribution.  The  density 
function  of  the  lognormal  distribution  is  given  by: 

;i°0  <■> 

where  a  and  0  are  the  parameters  of  the  lognormal  distribution 

It  is  well  known  that  the  mean  E(x)  and  the  variance  V(x)  are  given 
by: 

E(x)  =  n  -  exp(a  +  (1/2|  02) 

V(x)  =  o2  =  p2[(exp0-)-l] 

The  mode  of  this  distnbution  is  at 
MODE  *  exp  (o  -  0:) 

while  the  median  or  50th  percentile.  Pjq,  is  at 

P5o  =  'a 

By  letting  y  =  1  n  ~r»  a-  in  (i)  and  using  standard  normal  tables,  the 
Wth  percentile  was  found  to  be 

P90  =  exp  < I  2820  +  a) 

Other  fractile  points  can  be  found  in  a  similar  fashion. 

As  we  have  seen  the  lognormal  distribution  has  two  parameters  a 
and  0  Thus  to  fit  a  lognormal  distribution  to  the  subjectively  derived 
distribution  we  only  have  to  specify  two  values  such  as  P$q  and  PgQ. 
or  the  mean  and  the  standard  deviation.  For  the  following  example 
the  mode  =  200  and  the  median  =  250  are  used.  The  program 
output  includes  a  distribution  function  and  a  density  function.  The 
latter  is  given  in  Figure  3. 


10C  NORMAL  DISTRIBUTION 


DO  YOU  NEED  THE  COMBINATION  PRINTOUT?  YES-1,  NO-O  ?0 

WHAT  IS  THE  INPUT  COMBINATION  NUMBER  713 
MEDIAN  -  7250 
MODE  •  7200 

mi 

ALPHA  BETA  MEDIAN  MEAN  STD  DEV  MODE  PCTlf  TIME 

5.5215  0.4724  2500000  27*500  139.7542  200  0000  45*  0*42 


DO  YOU  WISH  TO  PRINT  X  AND  Y-NO-0.  YES-1.  RETURN-2  71 

WHAT  IS  XMIN.XMAX.DELX 

7100,650,20 


X-VAUIES 

Y-VAU1ES 

X-VALUES 

Y-VAUJES 

100 

1.2870*-® 

400 

1. 2870*-® 

120 

2. 10476E® 

420 

1. 10016C-® 

140 

2. 8401  If  -03 

440 

4.  374371-04 

160 

3. 37814E-03 

460 

7.4806C-04 

100 

3.6840*-® 

480 

6. 7808* -01 

200 

3.7768*-® 

500 

5. 75580E-04 

220 

3. 707* -03 

520 

4.88276E-04 

240 

3.5®7*-® 

540 

4.140401-04 

260 

3.2370*-® 

560 

3.5115*04 

280 

2.4306*-® 

580 

2.47842-04 

300 

2.61306E-® 

600 

2.52707E  04 

320 

2. 30232E-® 

620 

2  145162-04 

360 

1. 74150E-® 

640 

l.  8220* -04 

380 

1.50047E-® 

DO  YOU  WISH  TO  PLOT  X  AND  Y;  NO-0,  YES-1  ?l 


Now  the  distnbution  function  or  density  funcUon  can  be  visually 
compared  with  the  subjectively  derived  pnor  distnbution  using  the 
questionnaire  involving  the  debugging  hours.  If  “reasonable*' 
agreement  has  been  achieved  the  mathematical  form  of  the  density  has 
been  found  Several  combinations  of  input  values  might  have  to  be 
examined  in  order  to  achieve  the  “best’'  lit  This  form  is  important  in 
order  to  establish  the  posterior  distnbution  using  incoming  data  and  the 
likelihood  (unction  according  to  Bayes’  Theorem.  On  the  other  hand,  if 
“reasonable”  agreement  between  the  two  distribution  functions  has  not 
been  achieved,  a  new  family  of  distributions  may  be  tried  and/or  the 
empirical  distnbution  might  be  questioned.  Ultimately,  agreement  will 
be  found  unless  the  lognormal  distribution  is  not  a  valid  model 


*  DO  YOU  WISH  TO  INrtGRAlE-N0-0.  YIS-1.  RETURN-2  70 
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ABSTRACT 

A  new  discipline  for  improving  the  design  process 
of  large  complex  systems  has  been  proposed.  The  dis¬ 
cipline  consists  of  a  modeling  technique  (Design  Pro¬ 
cess  Analysis  Modeling)  combined  with  a  set  of  ana¬ 
lytical  procedures.  The  Design  Process  Analysis  Mo¬ 
del  and  procedures  are  described,  and  some  appli¬ 
cations  to  such  areas  as  computer  logic  design  and 
software  chief  programmer  teams,  are  discussed.  A 
number  of  potential  benefits  of  use  of  the  discipline 
are  identified,  including:  1)  Verifying  that  both 
technical  design  goals  and  organizational  objectives 
are  being  met  by  the  design  process;  2)  Providing  a 
common  approach  for  evaluating  the  cost-effectiveness 
of  an  existing  organizational  design  process;  3)  Es¬ 
tablishing  a  framework  within  which  different  organ¬ 
izations  can  evaluate  and  compare  design  method¬ 
ologies,  tools,  and  design  automation  techniques. 
Additional  areas  for  research  are  suggested. 

1.  INTRODUCTION 

As  all  types  of  systems  become  larger,  more  com¬ 
plex,  and  more  dependent  on  the  use  of  sophisticated, 
rapidly  changing  computing  technology,  the  design 
processes  for  such  systems  become  more  complex,  cost¬ 
ly,  and  higher  risk.  Today  new  computers  must  be 
designed  and  built  using  other  computers,  and  new 
software  for  one  computer  is  designed  using  a  larger 
computer  containing  yet  more  complex  software.  With 
labor  costs  increasing  and  digital  technology  costs 
decreasing,  increased  automation  of  the  design  pro¬ 
cess  for  large,  complex  systems  is  becoming  a  ne¬ 
cessity  rather  than  an  option.  The  problem  in  design 
process  automation,  however,  is  that  design  processes 
within  any  given  organization  are  extremely  complex, 
unique,  and  not  well  understood;  also  there  are  no 
commonly  accepted  analysis  disciplines  to  determine 
the  cost/benefits  of  automating  a  design  process. 

Thus  in  most  cases  the  results  of  design  process  auto¬ 
mation  are  less  than  optimal.  (Example:  the  soft¬ 
ware  design  process,  which  always  involves  use  of 
computers,  is  one  of  the  least  understood,  high  risk, 
and  high  cost,  types  of  system  design.)  A  commonly 
accepted  design  analysis  discipline  is  needed.  It 
should  be  capable  of  wide  application  in  different 
organizational  and  technology  environments  and  should 
Improve  the  success  of  any  design  process  automation 
project. _ 

•This  work  was  jointly  supported  by:  U.  S.  Office  of 
Naval  Research  (Contract  N00014-75-C-065Q) ,  U.  S. 

Naval  Air  Development  Center,  and  Design  Analysis 
Associates,  Inc. 


2.  TOWARD  A  DESIGN  PROCESS  ANALYSIS  DISCIPLINE 

During  an  initial  attempt  to  develop  a  design  pro¬ 
cess  analysis  discipline  [l],astudy  of  the  design 
processes  of  various  organizations  and  technology 
environments  resulted  in  the  following  observations: 

Observation  1:  The  most  unique  and  constraining 
factor  in  each  industrial  design 
process  appeared  to  be  the  organ- 
zation,  structure,  and  management 
of  that  process. x 

Observation  2:  In  the  industrial  organization  and 
management  of  each  design  process, 
a  number  of  critical  information 
flow  factors  existed  which  are 
also  critical  in  the  management 
of  other  types  of  industrial  pro¬ 
cesses. 

Observation  3:  The  most  "successful"  (where 

"success"  is  defined  differently 
by  different  organizations)  large 
system  designs  were  those  where 
the  organization  and  management 
of  the  design  process  itself  were 
evaluated  in  terms  of  their  impact 
on  the  system  to  be  designed,  and 
where  necessary  changes  in  the  de¬ 
sign  process  were  made  prior  to  the 
start  of  the  design. 

These  three  observations  pointed  toward  potential 
applicability  of  traditional  information  systems 
analysis  disciplines  and  industrial  dynamics  models 
to  sort  out  the  organizational  from  the  technical 
elements  of  the  design  process. 

In  terms  of  the  traditional  systems  analyst's 
input-process-output  model,  the  design  process  is 
the  "black  box"  (Fig.  1  )  which  transforms  inputs 
(the  system  requirements**)  into  outputs  (the  final 
system  blueprints**) ,  given  a  set  of  limited  re¬ 
sources,  external  constraints,  available  technology, 
and  within  the  boundary  of  a  particular  organization 


Figure  1.  Design  Process  Modeled  as 
Input-Process -Output 
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The  problem  can  be  stated  as  follows: 


4.  THE  DESIGN  PROCESS  ANALYSIS  MODEL 


Design  Process 

Analysis  Problem:  How  to  analyze  and  specify  the 
design  process  "black  box" 

(Fig.  1)  in  such  a  way  that  the 
final  system  blueprints  accur¬ 
ately  and  completely  implement 
the  system  requirements,  given 
technology,  resources,  exter¬ 
nal  constraints,  and  within  the 
the  organizational  environment. 


Given  Observations  1,  2,  and  J  and  the  above  pro¬ 
blem  statement, and  given  the  results  of  several  case 
studies  [1]  ,  we  were  led  to  the  following  assumption. 
It  underlies  the  development  of  our  design  process 
analysis  discipline: 

Assumption:  De  facto  design  processes  already 

exist  and  are  firmly  enmeshed  within 
the  organizational  structure,  busi¬ 
ness  goals,  resource  and  technology 
limitations,  and  management  style 
of  particular  organizations.  New 
theoretical  design  methodologies 
and/or  cost  evaluation  and  prediction 
tools  which  require  changes  in  the 
existing  process  will  not  achieve 
desired  results,  unless  we  first 
analyze  the  existing  design  process, 
its  raison  d'etre  and  its  information 
requirements. 

3.  REQUIREMENTS  FOR  THE  DISCIPLINE 

Our  concept  of  a  design  process  analysis  disci¬ 
pline  involves  a  set  of  rules  or  procedures  that  can 
govern  the  analysis  of  any  design  process,  and  a  top- 
level  conceptual  model  of  the  process  in  which  all  of 
the  key  elements  and  their  interrelationships  can  be 
identified.  The  intent  of  such  analysis  is  "obser¬ 
vation  of  the  real  world  as  it  is  ...  development  of 
best-fit  models  that  describe  but  do  not  explain  be¬ 
havior,  interpretation  of  the  models,  and  the  for¬ 
mulation  and  experimental  confirmation,  modification, 
or  rejection  of  theories  ..."  [5] 

The  design  process  analysis  discipline  should 
provide  a  common  framework  for:  a)  explicit  defi¬ 
nition  and  study  of  the  key  parameters  in  a  specific 
organization's  design  process,  b)  evaluation  and  com¬ 
parison  of  the  cost/benefits  of  various  design  method¬ 
ologies  and  tools,  c)  comparison  of  design  processes 
of  different  organizations  and  for  differing  tech¬ 
nologies,  and  d)  a  starting  point  for  more  formal 
modeling,  analysis,  and  prediction  of  design  cost  and 
risks. 

Based  on  our  initial  work  in  this  area[l]  ,  a 
preliminary  design  process  analysis  model  which  meets 
these  requirements  has  been  developed. 


The  Design  Process  Analysis  Model  is  an  adapt¬ 
ation  of  Forrester's  single-loop  feedback  system 
mode!  [6].  This  model  provides  a  framework  which 
captures  all  the  critical  elements  of  the  design  pro¬ 
cess  and  their  interrelationships.  The  elemental 
single  loop  model  (Fig.  2)  consists  of  two  processes, 
the  Decision  Process  (a),  as  modeled  by  Gorry  and 
Morton  [7]  ,  and  the  Design  Change  Process  (b),  where¬ 
in  the  most  current  system  blueprint  is  modified  by 
the  set  of  design  actions  (c)  initiated  as  output  of 
the  most  recent  decision.  Design  changes  are  embod¬ 
ied  in  design  information  (d)  which  is  fed  back  as 
input  into  the  decision  process  (a).  Decision  infor-' 
mation  (e)  is  also  fed  back  for  input  to  subsequent 
decisions.  The  entire  single-loop  design  process  is 
initiated  externally  by  the  set  of  anals,  resources, 
requirements,  and  constraints  (f)  analysis  represent¬ 
ing  the  output  of  a  higher  level  decision  which 
initialize  the  decision  process,  and  which  are  used 
to  determine  loop  termination.  (Termination  occurs 
at  the  point  when  the  design  and  design  information 
indicates  that  the  requirements  and  goals  for  that 
loop  have  been  met.  Each  iteration  of  the  loop  takes 
a  measurable  amount  of  time  and  consumes  a  measurable 
amount  of  resources.) 
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Figure  2.  Single-loop  Design  Process 
Analysis  Model  Characterizes 
Critical  Elements  of  Design 
Process  and  their  Inter¬ 
relationships 

The  model  distinguishes  among  three  different 
types  of  design  process  information:  1)  Design  (or 
"blueprint")  information  (d)  which  embodies  the  state 
of  the  system  design  at  a  particular  time.  Examples 
are  engineering  drawings,  program  listings,  flow 
charts,  HIPO  diagrams,  etc.  2)  Decision  information 
(e,f)  which  embodies  all  of  the  management  and  con¬ 
trol  information,  requirements  and  constraints,  cri¬ 
teria  for  acceptance  of  the  design,  and  decision  his¬ 
tory.  Examples  are  project  cost  data,  design  speci¬ 
fications,  design  goals,  organizational  objectives, 
project  plans,  progress  reports,  etc.  3)  Actions  (c) 
implemented  by  action  languages  and  media  which  effect 
changes  to  the  design  (b). 
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Design  processes  for  large,  complex  systems  in¬ 
volving  multiple  organizational  elements  can  be  model¬ 
ed  as  a  multiple-loop  design  process  where  individual 
loops  are  interconnected  by  actions  or  information  as 
shown  in  Fig.  3  (Detailed  discussions  of  interconnec¬ 
tion  and  modeling  of  multi-loop  design  processes  will 
be  found  in  l 1 ]  ). 
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Figure  3.  Example  Multi-Loop  Design  Process 

Model  Uses  Decision  Information  Paths 
to  Differentiate  Among  Levels  of 
Design  Detail  and  to  Manage  Design 
Process 

5.  SOME  BENEFICIAL  APPLICATIONS  OF  THE  MODEL 

The  Design  Process  Analysis  Modeling  technique 
(Figs.  2  and  3)  has  proved  useful  in  illustrating  to 
non-technical  managers  the  differences  among  various 
methodologies  and  design  approaches.  The  following 
are  some  simple  examples:  a)  In  structured  program¬ 
ming  the  limitation  to  three  basic  logic  elements 
(sequence,  if-then-else,  and  do-while)  can  be  des¬ 
cribed  in  terms  of  the  model  as  limiting  the  set  of 
design  actions  (Fig.  2c);  b)  The  differences  between 
"top-down  design"  and  "bottom-up  design"  can  be  ex¬ 
pressed  in  terms  of  changes  in  the  arrangement  and 
connectivity  of  the  design  feedback  loops;  in  terms 
of  differences  in  the  goals,  constraints,  and  require¬ 
ments  at  each  design  level;  and  in  terms  of  changes  in 
the  design  and  decision  information  paths;  c)  Use  of 
HIPO  or  SADT  diagrams  would  apply  to  the  format  of  the 
design  information  (Fig.  2d )  ;d )  Chief  programmer  teams 
can  be  expressed  as  a  particular  method  of  intercon¬ 
necting  the  team  members'  decision  and  design  change 
processes  via  the  decision  and  design  information  and 
action  paths;  e)  A  design  walkthrough  is  a  method  for 
closing  certain  design  loops. 


The  model  has  also  proved  useful  in  providing  in¬ 
sights  concerning  improvements  in  the  design  process¬ 
es  of  specific  organizations. 
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Figure  4.  Computer  Logic  Design  Process 
(Hypothetical  Case,  Highly 
Simplified) 


Figure  5.  Revised  Logic  Design  Process 
(after  Design  Analysis) 
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Fig.  4,  for  example,  illustrates  a  highly  oversim¬ 
plified  model  of  a  hypothetical  design  process 
for  a  computer  logic  design  group.  Fig.  5  shows  the 
same  hypothetical  process  as  modified  by  automation 
of  key  elements  (which  were  identified  through  use 
of  design  process  modeling  and  analysis  )+.  In  the 
logic  design  example,  major  benefits  resulted  from 
closing  the  individual  design  loops  earlier  (by  add¬ 
ing  the  logic  simulation  and  synthesis  system  within 
the  organizational  boundary  (Fig.  5).  This,  in  the 
real  world  case,  resulted  in  reducing  the  logic  de¬ 
signers’  turnaround  time  and  the  number  of  logic 
errors  introduced  into  the  equation  file,  which  in 
turn  reduced  design  cost. 

Finally,  the  Design  Process  Analysis  Model  has 
proved  useful  in  developing  more  detailed  models  and 
guidelines  for  certain  types  of  design  processes  in 
systems  which  are  reasonably  alike  and  in  organi¬ 
zational  environments  which  are  also  reasonably  alike. 
For  example,  a  model  and  guidelines  have  been  develop¬ 
ed  for  the  software  development  process  of  embedded 
computer  systems,  via  analysis  of  the  design  pro¬ 
cesses  of  forty-five  separate  product  types  in  eight 
different  industrial  organizations. 

6.  DESIGN  PROCESS  ANALYSIS  PROCEDURES 

As  the  model  has  evolved  through  its  application 
in  various  environments,  a  number  of  procedures  have 
been  developed  to  systematize  the  design  analyst1' 
approach.  These  procedures  are  similar  to  those  used 
in  traditional  systems  analysis  for  automation  of 
information  processing  systems,  except  that  the  De¬ 
sign  Process  Analysis  Model  is  used  as  a  framework. 

The  analysis  procedures  to  be  followed  by  the  de¬ 
sign  process  analyst  include: 

a-  Analysis  of  the  existing  and/or  future  de¬ 
sign  process  environment. 

b.  Understanding  and  documentation  of  present 
design  process  and  its  requirements:  cost, 
time,  and  information  flow,  expressed  in 
both  English  and  in  graphical  form  (c.f. 

Figs.  4  and  5). 

c.  Determination  of  current  design  process  in¬ 
adequacies  and  problem  areas. 

d.  Analysis  of  overall  design  process  infor¬ 
mation  flow,  timing,  content,  format,  as 
well  as  the  desired  acceptable  error  rate 
in  proposed  modifications  or  automation 
of  the  design  process. 

e.  Analysis  of  computerized  end  products  and 
human  end  products  desired  for  the  improved 
or  automated  design  process. 

f.  Analysis  of  similar  or  interfacing  processes 
and  systems  within  the  organization. 

g.  Determination  of  organizational  implications 
and  special  problem  considerations  involved 
in  any  new  or  revised  design  process. 


h.  Identification  and  evaluation  of  all  alterna¬ 
tives  for  modifying  the  existing  design  pro¬ 
cess,  (design  methodologies  and  approaches, 
automated  design  tools,  languages,  etc.)  in¬ 
cluding  requirements  for  all  the  elements  of 
the  design  process  model  (decision  process, 
actions,  design  process,  design  information, 
and  decision  information). 

Consistent  application  of  these  procedures  with¬ 
in  the  framework  of  the  Design  Process  Analysis  Model, 
creates  an  analysis  discipline  which  is  useful  in  im¬ 
proving  the  effectiveness  of  design  processes  for  any 
env->onment. 

7.  PRELIMINARY  CONCLUSIONS  CONCERNING  THE  DESIGN 
PROCESS 

Preliminary  application  of  the  model  -nd  analysis 
procedures  to  several  design  processes  has  led  to  the 
following  conclusions  which  are  currently  being  veri¬ 
fied: 


a.  Technology  and  technical  skill  requirements 
aside,  information  is  the  critical  element 
in  the  design  process,  in  relation  to  both 
the  eventual  quality  and  cost  of  the  system 
being  designed.  As  the  number  of  people  and 
organizational  elements  involved  in  the 
design  process  increases,  the  criticality 
of  the  information  also  increases  (c.f. 

F.  Brooks  observations  in  [8]),  and  there¬ 
fore  the  usefulness  of  design  process  analy¬ 
sis  disciplines  also  increases. 

b.  Many  existing  design  processes  operate  vi- 
tually  open-loop;  that  is,  there  is  no  veri¬ 
fiable  link  (until  late  in  the  process)  be¬ 
tween  either  the  system  design  goals  and 
their  embodiment  in  the  design  blueprint,  or 
the  organizational  management  goals  and  their 
embodiment  in  the  de  facto  design  process, 
(i.e.  the  design  feedback  loop  duration  is 
very  long).  As  the  number  of  people  and 
organizational  elements  involved  in  the 
design  process  increases,  the  payoffs  for 
establishing  shorter  design  feedback  loops 
also  increase  (reference  TRW's  experience 
[9]).  Application  of  design  process  analy¬ 
sis  techniques  prior  to  the  start  of  the 
design  can  be  used  to  identify  potential 
problems  associated  with  lengthy  design 
feedback  loops,  and  to  indicate  where  and 
how  these  loops  might  be  shortened,  (cf. 
differences  between  Figs.  4  and  5). 

c.  Each  organization  and  design  environment  has 
an  extensive  and  specific  set  of  design  and 
decision  information  processing  requirements 
which  are  unique  to  that  organization  and 
its  particular  design  process.  Methodolo¬ 
gies,  techniques,  and  tools  aimed  at  improv¬ 
ing  the  design  process  should  be  evaluated 
in  terms  of  the  specific  organizational  de¬ 
sign  environment,  in  order  to  be  applied 
effectively.  Design  analysis  techniques 
provide  a  framework  within  which  to  compare 
design  methodologies  and  tools,  and  to 
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establish  requirements  for  a  particular  design 
context. 

d.  Evaluation,  analysis,  and  prediction  of  cri¬ 
tical  factors  (such  as  design  cost)  become 
increasingly  difficult  as  the  number  of  people 
and  organizational  elements  involved  in  the 
design  process  increases.  Use  of  design  pro¬ 
cess  analysis  provides  a  means  for  effectively 
sorting  out  the  management,  organizational, 
technical,  and  human  factors  and  their  contri¬ 
butions  to  overall  design  costs. 

8.  RECOMMENDATIONS  FOR  FURTHER  STUDY 

Based  on  our  experience  we  have  identified  three 
major  areas  for  further  study  in  Design  Process  Analy¬ 
sis: 

8.1  Refinement  and  Formalization  of  the  Design 

Process  Modeling  Techniques  and  Analysis 

Procedures. 

To  date,  the  Design  Process  Model  and  Analysis 
Procedures  have  been  used  informally  to  help  capture 
information  concerning  limited  real-world  design  en¬ 
vironments.  In  order  to  facilitate  more  widespread 
general  use  of  these  techniques,  work  needs  to  be 
done  to  make  the  models  and  procedures  more  formal, 
and  to  develop  automated  tools  to  aid  in  analysis 
and  modeling. 

8.2  Extension  of  the  Verification  Base 

The  model  and  procedures  have  been  used  to  analy¬ 
ze  only  two  classes  of  design  processes,  computer 
hardware  design  and  embedded  computer  software  de¬ 
sign.  We  need  to  extend  our  application  of  the  De¬ 
sign  Process  Analysis  Model  and  procedures  to  addi¬ 
tional  classes  of  design  processes  in  order  to  fur¬ 
ther  verify  our  preliminary  observations  and  conclu¬ 
sions. 

8.3  Model  Development  as  Framework  for  Evaluating  and 

Comparing  Design  Methodologies  and  Cost  Esti¬ 
mation  Techniques. 

A  number  of  design  methodologies  and  design  cost 
estimation  techniques  are  being  developed.  Using  the 
Design  Process  Model  as  a  framework,  it  would  be  use¬ 
ful  to  develop  a  standard  approach  (applicable  to 
various  organization  and  design  environments)  for 
determining  the  requirements  for  design  methodologies 
and  cost  estimating  techniques,  and  for  comparing 
them. 

9.  SUMMARY 

In  this  paper  we  have  summarized  some  recent  work 
in  the  development  of  a  Design  Process  Analysis  dis¬ 
cipline,  and  have  described  a  few  of  the  benefits  ob¬ 
served  in  our  preliminary  application  of  the  disci¬ 
pline.  We  have  also  discussed  some  of  our  observa¬ 
tions  and  conclusions  drawn  from  experience  with  a 
Design  Process  Analysis  Model.  A  number  of  recom¬ 
mendations  for  further  research  have  been  made. 
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Abstract 

This  is  a  report  of  a  life-cycle  cost  model 
to  measure  the  effects  of  standardization  of  compu¬ 
ter  instruction-set  architectures  on  military  com¬ 
puter-based  systems.  The  study  considers  six  dif¬ 
ferent  scenarios,  one  of  which  assumes  that  stand¬ 
ardization  is  not  done,  but  only  four  different 
computer  architectures  are  used.  The  remaining 
five  scenarios  consider  the  effects  of  standardiz¬ 
ing  on  each  of  the  architectures  UYK-7,  UYK-19, 
UYK-20,  GYK-12,  and  UYK-41  (PDP-11). 

Standardization  impacts  life-cycle  costs  in 
several  ways.  There  is  an  inherent  difference  in 
the  value  and  utility  of  the  existing  support  soft¬ 
ware  bases  for  the  several  architectures.  The  com¬ 
mercially  supported  archi lectures  will  augment  and 
maintain  a  substantial  portion  of  the  softwar  se 
free  of  government  expenditure.  Finally,  some  ar¬ 
chitectures  are  more  efficient  than  others,  and  re¬ 
sult  in  lower  hardware  costs  if  used  as  a  standard. 

The  cost  model  attempts  to  incorporate  these 
factors  in  a  meaningful  way  to  judge  the  relative 
importance  of  these  factors  and  other  factors  on 
total  life-cycle  cost.  The  results  of  the  model 
show  that  the  GYK-41  results  in  the  least  life- 
cycle  cost  of  any  scenario  over  a  broad  range  of 
annual  rates  of  investment  in  support  software. 
These  conclusions  are  attributed  to  the  fact  that 
the  GYK-41  ranks  best  or  near  best  in  each  aspect 
that  impacts  total  life-cycle  cost. 


Introduction 

The  objective  of  the  study  reported  here  is 
to  measure  the  economic  effects  of  standardization 
of  computer  instruction-set  architectures  on  mili¬ 
tary  computer-based  systems.  For  the  purposes  of 
this  study,  the  term  architecture  refers  to  the 
characteristics  of  a  computer  defined  by  its  in¬ 
struction  repertoire.  Two  computers  are  said  to 
have  the  same  architecture  if  every  assembly  lan¬ 
guage  program  for  one  computer  runs  on  the  other 
and  conversely.  Two  such  computers  may  be  vastly 
different  in  implementation  and  have  radically  dif¬ 
ferent  costs  and  performances.  The  commercial 
computer  world  has  demonstrated  that  a  family  of 
implementations  of  a  single  architecture  is 
feasible  and  desirable. 


Military  computer  systems  are  starting  to 
take  on  characteristics  of  commercial  families  in 
that  prior-generation  computers  are  being  reimple¬ 
mented  with  new  hardware  to  take  advantage  of  the 
technological  improvements  in  cost  and  performance. 
At  issue  is  the  question  of  whether  to  standardize 
on  a  family  of  implementations  of  single  architec¬ 
ture  or  to  use  a  mix  of  many  architectures,  each 
used  in  an  environment  best  suited  to  it.  If 
standardization  appears  to  be  attractive,  than  a 
further  question  is  which  computer  architecture 
should  be  used  as  a  standard.  Standardization  can 
realize  potential  savings  by  eliminating  duplicate 
efforts,  but  on  the  other  hand  it  can  incur  addi¬ 
tional  costs  if  a  standard  is  used  in  an  environment 
for  which  it  is  not  well-suited.  The  question  then 
becomes  one  of  determining  which  standard  is  the 
best  overall  standard. 

The  method  used  in  this  study  to  recommend  a 
course  of  action  is  to  compute  the  relative  life- 
cycle  cost  for  78  representative  Army/Navy  computer- 
based  systems  which  are  acquired  and  deployed  over 
a  22-year  interval.  The  systems  are  acquired  in 
lots  of  26  for  three  different  time  periods--1980, 
1985,  and  1990--with  each  lot  deployed  for  10 
years.  R&D  costs  prior  to  each  acquisition  are  in¬ 
cluded  in  the  cost  model.  Thus  the  cost  model 
serves  to  identify  how  a  standard  computer  archi¬ 
tecture  can  impact  life-cycle  costs,  and  gives  some 
indication  of  the  potential  benefits  and  costs  of 
the  possible  decisions.  Any  model  of  this  type  is 
subject  to  errors  in  estimates,  so  that  the  abso¬ 
lute  dollar  figures  computed  must  be  viewed  as  in¬ 
dicative  of  possible  results  rather  than  as  pre¬ 
dictions  of  the  future.  The  model  is  successful 
in  identifying  the  important  factors,  and  in  es¬ 
timating  their  relative  importance  if  in  fact  the 
model  cannot  be  expected  to  predict  dollar  costs 
with  absolute  accuracy. 

To  isolate  the  key  variable, computer  ar¬ 
chitecture, from  different  hardware  implementations, 
all  computer  systems  are  presumed  to  use  the  same 
family  of  modules  and  chassis  in  their  implementa¬ 
tion.  These  modules  are  presumed  to  be  Military 
Computer  Family  (MCF)  modules  as  specified  by  ITEK 
Corporation  under  contract  to  the  U.S.  Army.  The 
modules  use  a  common  collection  of  memory  modules, 
input/output  modules  and  bus  interfaces,  with  dif¬ 
ferent  CPU  modules  available  to  implement  different 
instruction-set  architectures.  We  presume  that 
these  modules  can  implement  any  of  the  instruction 
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sets  for  the  UYK-7,  UYK-19,  UYK-20,  GYK-12,  and 
UYK-41  (PDP-11)  computers.  The  most  1 ikely  method¬ 
ology  for  realizing  the  collection  of  instruction 
sets  from  a  common  set  of  modules  is  to  use  CPU 
modules  specific  to  each  architecture  that  inter¬ 
face  to  the  memory  and  input/output  modules  over  a 
general  bus. 

The  cost  model  identifies  costs  arising  from 
principle  sources  (a)  conmon  costs,  which  are  costs 
such  as  product  planning,  R&D,  and  support  software 
required  to  mount  an  architecture  in  the  field 
apart  from  costs  for  specific  systems;  (b)  hardware 
life-cycle  costs,  which  include  acquisition,  logis¬ 
tics,  and  maintenance  costs;  and  (c)  software  life- 
cycle  costs,  which  include  initial  acquisition  op¬ 
erations,  and  maintenance  costs. 

A  c>-j:ial  problem  in  making  the  study  is  to 
assess  the  effects  of  the  existing  support  software 
tools  and  the  effects  of  continued  commercial  in¬ 
vestment  in  architectures  for  which  there  are  com¬ 
mercial  counterparts.  To  this  end  we  modeled  the 
effects  as  follows: 

1.  A  fixed  annual  government  expenditure  for  sup¬ 
port  software  is  assumed  to  occur  over  the  life- 
cycle.  The  study  treats  costs  as  a  function  of 
this  level  of  investment. 

2.  Part  of  the  investment  is  used  to  maintain  the 
existing  software  base  that  is  owned  by  the  govern¬ 
ment.  What  remains  after  maintenance  is  used  to 
procure  additional  support  software. 

3.  Commercial  tools  incur  no  charges  against 
government  funds  for  maintenance. 

4.  Although  it  is  likely  that  commercial  invest¬ 
ment  may  augment  the  support  software  base  for  com¬ 
mercial  architectures,  this  model  takes  the  con¬ 
servative  view  that  the  present  commercial  base  re¬ 
mains  fixed  at  its  initial  value  for  the  entire 
life-cycle. 

This  model  assumes  that 

5.  the  greater  the  value  of  the  software  tool 
base,  the  lower  the  cost  per  line  of  code  of  appli¬ 
cations  software.  An  equation  derived  from  actual 
cost  data  is  used  to  predict  this  cost,  and  the  mo¬ 
del  estimates  the  value  of  the  tool  base  as  a  func¬ 
tion  of  time  to  come  up  with  time  varying  estimates 
of  productivity. 

The  results  of  the  model  show  that  at  all 
levels  of  software  investment,  the  GYK-41  leads  to 
the  lowest  life  cost.  The  savings  is  $1.5  billion 
of  the  cost  of  a  standard  family  of  UYK-19  compu¬ 
ters  {22%),  and  about  $5.1  billion  (49" )  of  the 
cost  of  standard  UYK-7  and  GYK-12  families  for  an 
annual  software  investment  of  $2  million.  Regard¬ 
less  how  the  specific  effects  of  crucial  variables 
are  on  total  life-cycle  cost,  the  GYK-41  will  show 
up  most  favorably  in  comparison  to  the  other  ar¬ 
chitectures  because  it  is  at  least  as  good  or  bet¬ 
ter  than  the  other  architectures  in  each  of  three 
key  variables  that  impact  life-cycle  cost. 
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Abstract 

An  overview  of  our  research  activities  in  the 
area  of  software  design  tools  is  presented.  Keyed 
on  the  concepts  of  discipline,  formalism  and 
practicality  of  computer  aided  design  processes, 
and  the  need  for  quality  assessments  at  all  stages 
in  the  life  cycle,  a  concept  of  an  integrated  set 
of  development  and  management  tools  is  presented. 
Fundamental  to  this  concept  is  the  availability  of 
a  "static"  quality  measure  applicable  to  a  spect¬ 
rum  of  specification  levels  of  abstraction  and  the 
use  of  simulation  as  an  integral  part  of  the 
design  process  for  dynamic  analysis  and  validation 

Introduction 

In  considering  the  life  cycle  of  a  software 
system,  various  stages  or  milestones  have  been 
defined  (e.g.,  AFR-800-14),  ranging  from  require¬ 
ments  definition  and  analysis  to  installation 
maintenance.  An  awareness  of  total  life  cycle 
costs  associated  with  software  has  surfaced  a 
need  for  improvements  in  the  techniques  governing 
the  earlier  software  development  stages: 
requirements  identification,  analysis,  and  design 
development  and  validation  applied  through  a 
continuum  of  abstraction  levels  leading  to  coding 
specifications.  Key  objectives  of  such  techni¬ 
ques  include  identification  of  "optimal"  modular¬ 
ization  strategies,  the  surfacing  of  inconsistent, 
incomplete,  or  ill-defined  requirements,  and  the 
validation  of  early  products  in  the  life  cycle. 

An  emerging  vehicle  for  the  accomplishment  of 
these  objectives  are  languages  for  identifying 
requirements,  design  components,  or  design 
specifications.  These  are  collectively  referred 
to  here  as  design  languages.  Embedded  around 
supporting  software  for  language  processing, 
design  data  base  creation,  maintenance,  and 
assessment,  such  schemes  become  a  design  system 
which  can  serve  as  powerful  tools  for: 

•  A  disciplined  way  of  expressing  designs, 

•  A  distinct  phase- to-phase  staging  of  each 
design  task  with  subsequent  traceability  of 
requ i remen ts , 

•  Computer-assisted  documentation  production  as 
an  automatic  by-product, 

•  Management  control  facilities  due  to  the  high 
degree  of  visibility  available  through  design 
data  bases. 
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Several  design  languages  that  apply  at 
different  levels  of  the  design  process,  result  in 
different,  concrete  expressions  of  a  design 
product.  It  is  these  expressions  of  the  design 
product,  stored  as  computerized  data  bases,  which 
provide  the  key  to  a  maximally  controlled  software 
development  activity.  Indeed  the  state-of-the-art 
in  software  engineering  in  general  focuses  on 
design  data  bases  for  some  level  of  software 
express  ion. 

The  availability  of  a  machine  readable  data 
base  containing  design  information  opens  a  whole 
field  of  possibilities  for  learning  more  about 
the  design  product.  Going  far  beyond  code 
analyzers  (which  use  source  code  statements  as 
the  design  level  data  base),  we  now  have  the  means 
for  a  broad  range  of  feedback  generators  to  aid 
the  human  in  iterative  refinement  activities. 

Martin  Marietta's  expertise  in  this  area  is 
strengthened  by  a  vigorous  software  engineering 
research  program  which  continues  to  expand  the 
state-of-the-art.  Specific  recent  accomplish¬ 
ments  have  been  achieved  in  the  areas  of  design 
quality  assessment  as  well  as  very  high  level 
languages  for  requirements  and  design  (14),  (16). 
Very  high  level  languages  allow  an  expression  of 
a  target  system,  i.e.,  a  description  at  some 
level,  which  can  be  used  to  form  a  database.  This 
database  then  provides  us  with  a  tangible, 
measureable  object.  (The  major  related  work  is 
with  static/dynamic  code  analyzers  to  determine 
aspects  such  as  program  structuredness  or 
complexity.  This,  however,  is  after-the-fact: 
design  measures  are  more  cost  effective  if  they 
can  be  used  before  code  is  produced).  Therefore, 
our  current  software  engineering  research  program 
looks  specifically  at  languages  and  features 
tailored  to  the  measurement  goal. 

In  order  to  use  measurement  as  a  mechanism 
for  design  assessment,  some  criterion  is  needed. 
That  is,  while  we  have  the  raw  data  for  measure¬ 
ment,  we  still  need  to  know  what  characteristic 
our  "measure"  will  be  informative  of.  One 
criterion  that  can  be  used  is  one  of  structural 
quality  where  "quality"  can  be  associated  with 
aspects  of  maintainability,  reliability, 
testability,  or  modularity.  The  assessment  scheme 
then  produces  a  "static"  design  quality  metric 
or  index. 
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Before  going  further,  we  emphasize  the 
distinction  between  software  design  as  a  process, 
and  the  design  as  an  end-result  or  product.  Of  the 
many  techniques  proposed  to  positively  influence 
the  design  process,  they  all  are  motivated  by  the 
conroon  goals  of  producing  a  system  that  is  struct¬ 
urally  simple,  maintainable,  and  testable,  i.e., 
modules.  If  indeed,  such  characteristics  can  be 
induced  into  a  design,  the  question  arises  as  to 
how  they  would  then  be  embodied.  Our  hypothesis 
is  that  the  essence  of  such  quality  character¬ 
istics  -  and  others  as  well  -  is  representable  by 
the  structural  aspect  of  the  design.  Furthermore, 
we  believe  that  the  structure  which  supports  or 
"carries"  these  characteristics  can  be  examined 
as  to  the  extent  that  one  or  more  design  goals 
is  met. 

Another  complimentary  criterion  is 
performance ,  which  generally  is  measured  by  some 
level  of  simulation  attacking  aspects  such  as  man- 
machine  interaction,  adequacy  of  computer  hardware, 
etc.  In  the  remainder  of  this  paper,  tools  useful 
to  support  both  these  aspects  of  measurement  will 
be  briefly  outlined. 

Static  Metrics  and  Decomposition 

Design  quality  metrics  have  been  addressed  in 
the  past  from  various  viewpoints.  Andreu  (1)  uses 
a  strength  and  coupling  measure  applied  to  a  graph 
representation  of  requirements  and  their  inter¬ 
relationships  for  preliminary  design.  Myers  (9) 
has  developed  a  model  in  terms  of  probability 
measures  applied  to  discrete  strength  and  coupling 
factors  of  program  modules  which  is  used  to  assess 
the  ramifications  of  making  program  changes. 

Schutt,  et  al  (15)  have  applied  an  information 
entropy  measure  to  hypergraph  represen tat  ions  of 
computer  processes  and  data  structures.  And 
McCabe  (7)  shows  a  method  for  determining  quality 
as  a  function  of  module  "struc tured-ness".  Each  of 
these  approaches  relates  in  some  way  to  a  system 
decomposi tion  measure.  However,  only  Andreu 
directly  addresses  the  problem  of  forming  the 
decomposition  Itself.  Parnas  (11)  also  gives  a 
strategy  for  decomposition  (applied  to  the  design 
process,  rather  than  the  product)  but  does  not 
attempt  a  design  metric  even  though  considerable 
attention  is  given  to  resulting  design  quality. 

Each  of  these  approaches  assumes  its  own 
specific  starting  point  located  somewhere  in  the 
software  development  spectrum.  One  can  think  of 
all  the  points  along  such  a  spectrum  as  various 
levels  of  expression  of  the  design.  And  express¬ 
ion,  as  a  conmuni cation  medium,  will  assume  some 
linguistic  form.  These  forms  progress  from 
natural  language  at  the  early  concept /requ irements 
stage  to  high  order  programing  languages  at  the 
implementation  stage.  Rather  than  limit  ourselves 
to  a  specific  form  or  level  of  expression,  we  would 
like  it  to  be  arbitrary.  In  order  to  apply  a 
design  strategy  and  measurement  scheme  to  a  given 
expression,  a  transformat  ion  must  produce  an 
encoded  form  of  expression  to  which  methods  for 
the  analysis  of  structure  and,  eventually,  quality 
can  be  applied.  The  transformation  process 


establishes  a  graph  by  identifying  and  converting 
appropriate  linguistic  objects  to  nodes  (vertices) 
and  relationships  to  link  (edges).  Graph  links 
between  nodes  must  also  be  qualifiable  as  to 
strength,  type,  or  Importance;  i.e.,  weighted. 

The  design  process  is  equated  to  the  decom¬ 
position  of  this  graph  into  a  collection  of  system 
"elements"  (modules,  programs,  subsystems,  etc.), 
i.e.,  a  clustering  activity,  which  can  be  sub¬ 
jected  to  standard  metric  analysis.  Measures  such 
as  "strength"  or  "coupling"  or  "complexity"  of  a 
structure  are  generalized  in  this  framework  and 
used  at  varying  levels  of  abstraction. 

For  a  fixed  class  of  problems  we  have 
determined  how  the  choice  of  a  metric  will 
influence  the  quality  measure  produced  -  an 
adjacency  matrix  produces  decomposition  based  on 
the  number  of  interfaces  between  and  within 
clusters.  A  distance  matrix  produces  a  similar 
result  but  is  a  more  comprehensive  criterion 
applicable  to  highly  complex  graphs.  Other  metrics 
which  represent  links  as  probability  values, 
information  channel  quanta,  or  component  control 
lines,  would  need  further  research  to  determine 
the  characteristics  of  structure  that  quality 
indexes  based  on  such  metrics  would  reflect. 

Simulation  Tools 


Our  design  approach  includes  a  dynamic  design 
validation  at  key  stages  in  the  design  process. 

Two  such  stages  are  the  definition  of  the  man- 
machine  interface  (MMX)  and  ADP  resource  require¬ 
ments  (timing,  memory,  1/0  parallelism,  etc.)  for 
the  system.  Key  tools  here  include  "virtual" 

I/O  device  simulation,  functional  simulation 
languages,  enulations,  etc.  Research  in  this 
area  has  been  directed  at  the  interface  definit¬ 
ions  which  allow  such  simulations  to  become  a 
natural,  cost-effective  element  of  the  design 
pr'bcess.  For  example  a  top  level  clustering 
analysis  might  support  a  delineation  between 
human  operator  and  computer  functions  for  an 
interactive  application.  In  addition  to  design 
validation  from  the  human  factors  point  of  view, 
a  simulation  driven  by  a  requirements/design  data 
base  could  surface  key  dara/process  structure 
implications  and  drive  the  test  scenario  develop¬ 
ment.  Likewise,  tools  such  as  functional  simul¬ 
ation  of  hardware  and  software  can  surface  key 
resource  allocation  problems  associated  with  the 
chosen  hardware  environment,  if  driven  directly 
from  the  problem  data  base.  Such  interfaces  form 
the  basis  for  an  integrated  software  design 
facility  concept  described  in  the  attached 
L igure. 
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PROGRAMMERS  ARE  TOO  VALUABLE  TO  BE  TRUSTED  TO  COMPUTERS 
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ABSTRACT 

Most  of  the  anticipated  gains  from  program¬ 
ming  tools  have  been  slow  in  arriving,  and  often 
disappointing  when  they  arrive.  A  major  cause 
of  this  condition  is  the  failure  to  understand 
the  processes  by  which  new  technology  is  intro¬ 
duced.  The  role  of  training  has  been  left  by 
default  to  computers,  under  the  assumption  that 
they  are  cheaper  or  better  than  human  teachers. 
Methods  of  learning  without  using  either  comput¬ 
ers  or  teachers  are  not  even  considered,  though 
they  are  far  more  economical  and  effective.  The 
computer  can  have  a  role;  the  teacher  can  have 
a  role;  but  unless  the  overall  climate  for  pro¬ 
fessional  learning  is  established,  the  computer 
and  the  teacher  are  hopelessly  inadequate  to  the 
full  job  of  creating  a  corps  of  professional 
programmers. 


Choosing  a  Teaching  Language 

Every  four  years,  along  with  county  elec¬ 
tions, the  local  computer  science  professors  raise 
the  question  of  the  correct  teaching  language 
for  programming.  There's  a  lot  of  brave  talk 
about  throwing  the  rascals  out,  many  lunches 
devoted  to  campaigning,  a  wave  of  confidence 
just  before  the  election,  and  then  the  ultimate 
defeat  of  the  upstart.  In  the  end,  both  FORTRAN 
and  the  sheriff  are  reelected.  They  may  be  cor¬ 
rupt;  they  may  be  incompetent;  they  may  be  creak¬ 
ing  with  age;  but  they're  at  least  familiar. 

Like  the  county  voters,  professors  are  quite 
ready  to  rationalize  the  result.  As  the  years 
since  1956  have  accumulated,  the  points  in  these 
arguments  have,  one  by  one,  withered  away.  And  yet 
one  remains,  year  after  year,  the  backbone  of 
conservatism  everywhere.  To  quote  a  genuine 
professor : 

"My  decision  to  base  this  course  on  the 
WATFIV  programming  language  was  founded 
not  only  on  a  recognition  of  real-world 
applications  but  also  on  the  raw  econo¬ 
mics  of  computer  costs  at  our  installa¬ 
tion." 

"Real-world"  and  "raw  economics"  are  no- 
nonsense  words — none  of  your  ivory-tower  fol-de- 
rol.  Let's  look,  therefore,  at  some  of  the  "raw 
economics"  of  computer  costs  for  training  in  the 


"real  world". 

Twenty  years  ago,  computers  were  really  ex¬ 
pensive — so  expensive  that  FORTRAN  had  to  be 
ruled  out  of  consideration  for  the  language  used 
in  a  programming  course.  In  fact,  even  the  rich¬ 
est  of  the  rich  thought  using  a  computer  in  a 
programming  course  was  a  frivolous  extravagance. 

Twenty  years  is  a  long  time  in  a  sheriff's 
life,  and  even  longer  in  the  life  of  the  program¬ 
ming  profession.  Few  remember  those  days,  or 
even  believe  that  it  was  possible  to  teach  pro¬ 
gramming  without  a  computer.  Soon,  at  the  rate 
personal  computers  are  spreading,  few  will  remem¬ 
ber  what  it  was  like  to  learn  computing  with  a 
teacher .  In  the  four  years  since  that  WATFIV 
decision  was  rationalized,  the  "raw  economics" 
have  changed  so  sharply  that  we  could  almost 
afford  to  give  each  student  a  personal  computer. 
In  this  year  of  1978,  it  sounds  archaic  to  say 
"our  installation"  in  the  singular.  What  univer¬ 
sity  worth  accred i t al ion  hasn't  got  a  dozen  or 
more  minis  and  micros  scattered  around  the  campus 
1  ike  Homecoming  handbi  Us? 

Still,  the  FORTRAN  argument  has  survived  the 
latest  reduction  in  hardware  costs.  Now  it  is 
the  micros  that  can't  afford  to  run  anything 
else — except,  heaven  forbid,  BASIC.  Where  does 
this  conservatism  originate?  Where  will  it  end? 
How  far  will  it  spread?  Why  does  it  resist  the 
repeated  efforts  of  language  designers  and  imple¬ 
mentors  to  break  It  down? 

Using  Software  Tools 

The  programming  language  is  merely  the  oldest 
and  most  familiar  software  tool,  and  universities 
are  merely  the  oldest  and  most  familiar  tools  for 
social  change.  The  same  conservatism  in  the 
adoption  of  new  tools  is  found  for  all  other 
tools  in  all  other  institutions.  If  we  can  an¬ 
swer  our  questions,  the  payoff  could  be  stagger- 
tng. 

Of  the  many  reasons  for  non-use  of  new  tools, 
perhaps  the  roost  obvious  is  the  lack  of  attention 
Riven  to  training.  An  elephantine  sum  of  money 
has  been  spent  on  the  development  of  software 
tools-  the  current  rate  probably  exceeds  a  bil¬ 
lion  dollars  a  year.  In  contrast,  a  micro- 
organismlc  sum  of  money  has  been  spent  on  train¬ 
ing  people  to  use  those  tools. 
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Most  of  the  tools — perhaps  as  a  consequence 
of  this  disparity  between  development  and  train¬ 
ing — are  never  or  hardly  ever  used.  Not  only 
do  people  continue  to  use  FORTRAN,  but  they  con¬ 
tinue  to  use  it  without*  for  instance,  even  get¬ 
ting  cross-reference  listings  of  their  program 
variables.  Even  when  their  FORTRAN  compiler 
provides  such  a  cross-reference,  the  installa¬ 
tion  disables  it,  usually  as  a  "standard", 
because  "it  costs  too  much".  And,  even  where 
it  is  routinely  produced,  90%  of  the  programmers 
never*  look  at  it— yet  the  cross-reference  list¬ 
ing  is  one  of  the  simplest  software  tools,  one 
of  the  most  direct  in  its  use,  one  of  the  most 
convenient,  and  one  of  the  most  ancient. 

The  situation  is  little  better  in  the  major¬ 
ity  of  installations  that  have  abandoned  FORTRAN 
for  "higher"  languages.  As  an  exercise  in  for¬ 
mal  review  techniques,  our  clients  and  students 
study  published  programs,  which  presumably  are 
held  up  as  examples  for  novices  to  follow.  In 
a  typical  review,  of  a  program  by  two  PhD  Pro¬ 
fessors  of  Computer  Science,  we  studied  the  use 
of  PL/I.  Although  the  specificat ions  offered 
ideal  situations  for  employing  each  of  them, 
none  of  the  following  PL/I  facilities  were  used: 

1.  dynamic  allocation  of  storage 

2.  cross-section  notation 

3.  array  expressions 

4.  factoring  of  attributes 

5.  subscript  expressions 

6.  control  of  type  conversions 

7.  bit  strings 

One  could  almost  have  removed  the  semicolons 
and,  in  effect,  compiled  the  program  on  a 
FORTRAN  compiler. 

Further  examination  of  the  program  revealed, 
as  is  typical,  no  impact  of  the  years  of  dis¬ 
cussion  of  programming  style .  Among  the  more 
pitiful  stylistic  practices  we  found: 

1.  intricate  branching,  including  into  and 
out  of  loops 

2.  superfluous  statements  based  on  incom¬ 
plete  understanding  of  the  action  of  earlier 
statements 

3.  initialization  of  variables  upon  exit 
from  loops  that  used  them 

4.  use  of  single-character  names  such  as 
K  and  R 

5.  use  and  reuse  of  scalar  variables  in  a 
program  not  pressed  for  storage 

6.  use  of  a  name  with  a  different  meaning 
in  the  program  and  in  a  comment  explaining  its 
meaning 

7.  use  of  the  keyword,  PTR,  as  a  data  name 
in  a  most  confusing  context 

8.  general  inconsistency  in  the  naming  of 


variables 

9.  computations  inefficiently  placed  within 
loops,  yet  rendered  inaccessible  to  an  optimizer, 
in  a  program  that  heavily  used  pointers  "for 
efficiency" 

At  the  level  of  design ,  the  program  again  showed 
no  influence  of  recent  discussion  in  the  industry, 
let  alone  design  tools  and  concepts.  We  found: 

1.  no  checking  whatsoever  for  valid  input, 
either  bounds  or  values 

2.  unchecked  input  used  to  control  calcula¬ 
tions*  as  in  computed  branches 

3.  a  completely  undesigned  and  error-prone 
input  format 

4.  an  algorithm  which  was  inefficient  for 
all  but  small  cases  of  input 

5.  no  monitoring  of  performance  of  the  al¬ 
gorithm,  which  might  have  indicated  loss  of  per¬ 
formance  to  the  user 

6.  incomprehensible  error  messages 

7.  comprehensible  error  messages  that  were 
wrong  or  misleading 

In  our  work,  we  have  reviewed  hundreds  of 
programs  from  dozens  of  installations.  The  pro¬ 
grams  display  approximately  the  same  range  of 
problems,  and  the  installations  displav  approx¬ 
imately  the  same  non-use  of  tools.  At  least  75% 
of  the  installations  routinely  debug  using  unfor¬ 
matted  hexadecimal  dumps.  At  least  90%  have 
never  used  a  preprocessor.  Program  libraries  are 
coming  into  use,  but  not  in  more  than  50%  of 
installations.  Test  data  generation  is  rare; 
archiving  of  test  data  even  more  rare;  and  even 
a  rudimentary  data  set  comparison  is  hardly  ever 
used. 

We're  not  speaking  of  the  number  of  these 
tools  that  sit  on  the  shelf,  accumulating  dust 
and  rent.  We  are  counting  the  tool  as  "used" 
if  someone  uses  it,  even  if  for  a  minor  part  of 
its  capabilities,  so  these  figures  overestimate 
real  professional  use.  Since  we  are  spending 
some  money  on  training,  what  can  it  possibly  be 
teaching  them? 

The  Computer  as  Instructor 

In  view  of  the  inadequate  use  of  our  expen¬ 
sive  tools,  it  is  obvious  that  we  are  not  teach¬ 
ing  programmers  to  use  the  computer  adequately 
in  their  programming  work.  The  conclusion  seems 
obvious — we  must  spend  more  on  computers  in  our 
classes,  not  less. 

Obvious ?  It  would  seem  so,  until  we  look  at 
the  way  the  computer  is  being  used  in  classes. 
Indeed,  we  are  not  teaching  them  at  all — the 
computer  is  carrying  the  burden  for  us.  As  a 
result,  we  find  ourselves  standing  on  both  sides 
of  the  same  fence.  It  is  the  double  thesis  of 
this  essay  that 
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1.  The  computer  is  insufficiently  used  in 
programmer  education. 

2.  The  computer  is  far  overused  in  program¬ 
mer  education. 

Let's  examine  how  the  computer  is  used  in  a 
typical  university  course  in  programming.  The 
nature  of  this  use  may  best  be  understood  in 
terms  of  an  analogy.  Suppose  we  have  succeeded 
In  developing  a  "paper-grader"  program  for  high- 
school  English  courses,  and  that  we  have  succeed¬ 
ed  in  getting  ^he  program  used  within  the  high- 
schools  in  the  following  way: 

1.  The  teacher  lectures  on  one  topic  or 
another  to  50-500  students. 

2.  The  teacher  gives  an  assignment  to  write 
an  essay. 

3.  The  students  write  an  essay,  under 
strict  orders  not  to  help  anyone  else  or  to 
receive  help  from  anyone  else. 

4.  The  essay  is  graded  by  our  computer 
program  on  the  basis  of 

a.  spelling  errors 

b.  grammatical  errors 

5.  The  paper  is  returned  to  the  students 
with  the  grade. 

>What  do  you  suppose  the  students  will  learn? 

The  reason  this  analogy  is  good  is  that  we 
don't  have  to  guess  what  the  students  will  learn. 
We  already  know.  Even  without  the  computer, 
this  is  the  way  many  high  school  English  class¬ 
es  teach  composition,  and  the  results  are  noto¬ 
riously  bad.  Many  of  the  students  learn  to 
spell;  some  learn  to  avoid  incorrect  grammar; 
essentially  none  learn  to  communicate. 

In  programming  education,  the  "paper  grader" 
is  already  built  into  the  compiler.  Because  pro¬ 
gramming  assignments  must  be  recycled  through 
the  grader  until  all  "spelling  and  grammar" 
errors  are  eliminated,  the  emphasis  on  these 
aspects  is  all  the  stronger.  In  many  classes, 
the  professor  has  no  time  or  stomach  for  reading 
the  actual  programs.  In  some,  not  even  the 
outputs  are  read.  In  such  classes,  students 
turning  in  wrong  outputs  never  find  out  from 
the  instructor.  Students  turning  in  fake 
outputs  arc  never  found  out  by  the  instructor. 

To  solve  the  problem  of  unread  output, 
sophisticated  schools  have  developed  "grader" 
programs  which  exercise  the  student  programs 
using  test  inputs  and  scoring  the  resultant 
outputs.  Graders  definitely  raise  the  level  of 
computer  assistance — but  mostly  to  the  harried 
instructor,  not  to  the  student.  After  all,  for 
classes  with  hundreds  of  neophytes,  how  else 
can  the  legions  of  warm  bodies  be  handled  econ¬ 
omically,  or  handled  at  all. 

Grader  programs,  to  give  them  their  due, 
actual Iv  can  represent  an  advance  over  the  simple 
compiler  checking  that  most  schools  still  use. 


But,  again,  what  is  it  they  teach?  By  reviewing 
programs  produced  by  these  students,  in  class 
and  years  later,  you  can  learn  what  the  computer 
teaches  about  programming: 

1.  accurate  card  punching,  in  a  batch  envi¬ 
ronment 

2.  use  of  a  text  editor  to  overcome  inaccu¬ 
rate  keying,  in  an  on-line  environment 

3.  spelling  of  keywords 

4.  consistent  spelling  of  prog rammer- chosen 
words 

5.  a  subset  of  the  syntax  of  a  programming 
language 

But  beyond  these  important  lessons,  the  computer — 
used  in  these  ways — teaches  a  much  deeper  lesson, 
one  that  will  remain  long  after  the  WATFIV  syntax 
is  forgotten.  That  lesson  is: 

Programs  are  shown  to  be  correct  by  testing  them. 

This  is  a  curious  lesson.  To  quote  the  ori¬ 
ginal  sermon  on  structured  programming  by  Edsger 
Dijkstra: 

"...the  extent  to  which  the  program 
correctness  can  be  established  is  not 
purely  a  function  of  the  program's 
external  specifications  and  behaviour 
but  depends  critically  on  its  internal 
structure." 

In  short,  what  the  computer  is  teaching  is  pre¬ 
cisely  antithetical  to  the  principal  lesson  of 
the  strongest  movement  for  improved  programming 
since  the  invention  of  the  assembler.  It  is 
teaching  this  lesson  every  day,  in  every  school 
in  the  country,  to  thousands  and  thousands  of 
present  and  potential  programmers. 

But  there  is  a  second  high-level  lesson* — a 
meta-lesson,  actually — a  lesson  about  learning 
itself: 

We  learn  to  program  by  throwing  garbage 
into  a  computer  and  seeing  what  comes  out. 

This  lesson  is  like  the  prejudices  of  our  youth — 
deeply  set  and  hard  to  change.  What's  more,  it's 
being  taught  earlier  and  deeper,  now  that  person¬ 
al  computers  are  so  readily  available.  What  are 
we  going  to  do  with  the  next  generation  of  pro¬ 
grammers? 

Render  unto  the  computer. . . 

Computers  are  excellent  at  teaching — about 
computers.  For  instance,  no  amount  of  lecturing 
about  syntax  errors  seems  to  make  the  slightest 
impression  on  the  majority  of  students--unless 
it  is  a  slightly  negative  one.  A  compiler, 
though,  patiently  and  mercilessly  teaches  syntax 
to  one  recalcitrant  student  after  another. 

Cleverly  used,  it  can  even  motivate  some  students 
to  learn  princip i es  of  syntax,  though  they  have 
to  obtain  those  principles  elsewhere. 
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Some  toolmakers  say  that  syntax  and  spelling 
are  unimportant  lessons,  because  tools  can  be 
built  to  correct  any  errors.  I  can  agree  with 
them  only  partly,  for  no  system  will  ever  be  able 
to  correct  all  errors  of  any  type.  Consequently, 
even  the  most  brilliant  programmer  must,  at  some 
time,  learn  the  hard  lessons  of  syntax  and  spell¬ 
ing — or  else  waste  hundreds  of  frustrating  hours. 

Furthermore,  even  when  the  most  sophisticated 
and  careful  proof  techniques  are  applied  to  the 
most  magnificently  structured  program,  the  comput¬ 
er  may  reveal  two  kinds  of  error  that  can  slip 
through.  First,  there  are  the  simple  proof¬ 
reading  errors — errors  that  plague  the  most  advan¬ 
ced  mathematics  journals  as  well  as  the  most 
humble  programs.  Second,  there  is  the  complete 
misunderstanding  of  the  problem. 

Though  a  programmer  may  prove  that  the  pro¬ 
gram  does  what  she  things  it  does,  she  can  never 
prove  that  what  she  thinks  it  should  do  is  what 
the  user  or  users  wanted  it  to  do.  In  one  sense, 
there  are  no  wrong  programs,  only  different 
programs.  The  only  hope  we  have  of  discovering 
if  we've  solved  the  right  problem  is  by  giving 
the  proposed  solution  back  to  the  originator (s) . 

Consider  the  following  utterly  typical 
example.  A  professor  of  archaeology  was  teach¬ 
ing  a  large  introductory  course,  assisted  by  the 
computer  to  the  extent  of  its  printing  individ¬ 
ualized  examinations  drawn  at  random  from  a  pool 
of  questions.  An  advanced  student  in  the  com¬ 
puter  science  department  had  been  given  the  job 
of  writing  the  program  to  print  the  exams,  but 
the  program  had  one  slight  flaw.  Rather  fre¬ 
quently,  one  exam  contained  the  same  question 
twice,  or  even  three  times.  Rather  than  sampl¬ 
ing  the  question  pool  "without  replacement",  the 
program  was  sampling  "with  replacement". 

The  professor  noticed  this  defect  and  con¬ 
fronted  the  student.  He  was  told  that  the  pro¬ 
gram  could  sample  without  replacement,  but  that 
the  process  was  "very  inefficient".  He  knew 
that  the  professor  certainly  would  not  want  to 
pay  the  extra  cost,  so  It  would  be  better  to 
print,  say  14  questions  to  be  sure  of  getting  10 
unique  ones.  The  students  just  had  to  be  told 
to  answer  "the  first  10  unique  questions". 

The  reader  may  want  to  speculate  just  which 
of  the  many  poor  algorithms  this  student  program¬ 
mer  had  chosen.  In  fact,  the  technique  was 
conceptually  very  simple.  When  sampling  "without 
replacement"  was  specified,  the  program  would 
produce  a  tentative  exam  and  then  test  to  see  if 
it  contained  any  repeated  questions.  During 
program  test,  10  questions  were  being  drawn  from 
a  pool  of  20,  which  meant  that  about  30  exams  had 
to  be  drawn  to  get  one  without  duplicates! 

We  must  be  careful  to  draw  the  correct  moral 
from  this  tale.  Most  professors  of  computer 
science  would  lament  this  miserable  student’s 
ignorance  of  algorithms.  Although  that  kind  of 
ignorance  is  sad,  we  could  learn  to  live  with  it 


In  the  "real  world".  What  we  can't  live  with  is 
this  student's  ignorance  of  the  programmer's  role 
in  life. 

It  was  not  for  the  programmer  to  decide  how 
much  the  professor  wanted  to  spend  on  exams.  His 
guess  about  what  the  professor  wanted  led  him  to 
authorize  printing  of  several  thousand  exams  that 
had  to  be  discarded.  The  deadline  was  missed, 
and  the  old  system  of  a  typed  exam  had  to  be 
used.  Unless  one  is  programming  for  one's  own 
amusement,  the  final  decision  about  whether  or 
not  a  program  is  correct  rests  not  with  the  com¬ 
puter,  not  with  the  programmer,  but  with  the 
party  with  the  problem.  And  woe  unto  us  as  the 
hordes  of  computer  hobbyists  hit  the  professional 
ranks — for  them,  all  programming  is  strictly  for 
personal  amusement . 

Another  relevant  point,  of  course,  is  that 
the  programmer  had  not  the  slightest  idea  that 
his  program  was  wrong.  He  didn't  even  know  it 
was  wrong  from  an  "efficiency"  point  of  view. 

This  lesson,  neither  the  computer  nor  the  pro¬ 
fessor  could  teach. 

There  are,  in  the  end,  a  multitude  of  lessons 
one  must  learn  to  become  truly  a  professional  pro¬ 
grammer.  Each  such  lesson  has  its  own  character¬ 
istic  ways  of  being  learned.  In  designing  pro¬ 
gramming  courses,  we  must  see  that  each  lesson 
is  taught  in  the  most  effective  way,  and  not 
merely  that  a  certain  number  of  students  can  be 
"processed"  for  a  whole  semester  without  demand¬ 
ing  a  tuition  refund. 

How  to  Teach  a  Programming  Course 
With  or  Without  a  Computer 

How  do  we  design  such  a  programming  course? 

The  general  pattern  should  now  be  clear: 

1.  Use  the  computer  to  teach  what  only  the 
computer  can  teach. 

2.  Use  people  to  teach  what  only  people 
can  teach. 

3.  Make  the  "economic"  choice  between  com¬ 
puter  and  people  only  in  those  cases  where  either 
can  do  the  job  with  equal  effectiveness . 

No  doubt  an  introductory  course  ought  to 
begin  with  one  or  two  encounters  with  the  machine, 
to  teach 

1.  the  overall  process  by  which  programs 
get  created 

2.  the  finicky  nature  of  computing  machines 

3.  the  mismatch  between  such  machines  and 
our  abilities  to  be  precise 

After  these  lessons  are  taught,  at  least  on 
an  introductory  level,  the  class  should  turn  to 
its  human  resources  to  learn  more  difficult 
lessons.  A  typical  assignment  might  involve: 
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1.  a  problem  posed  by  the  instructor  acting 
as  "user"  but  with  certain  lessons  in  mind 

2.  each  student  making  a  trial  solution,  on 

paper 

3.  each  student  evaluating  the  trial  solu¬ 
tion  of  another  student 

4.  small  groups  of  students,  sometimes  with 
instructor  supervision,  arriving  at  a  composite 
solut ion 

5.  sometimes  trying  the  composite  on  the 
machine 

6.  groups  exchanging  solutions  for  more 
formal  review,  sometimes  guided  by  the  instructor 
in  front  of  the  entire  class 

In  such  a  class,  the  students  ingest  a  much 
richer  and  more  meaningful  diet  of  programming 
information.  They  avoid  time  wasted  on  syntax 
and  spelling,  which  are  corrected  as  a  byproduct 
of  the  informal  evaluations — and  which  in  any 
case  could  be  taught  by  machine  except  for  the 
fact  that  it's  actually  cheaper  and  more  effect¬ 
ive  to  do  it  through  reviews!  From  student  to 
student,  from  teacher  to  student,  passes  know¬ 
ledge  about  t.?,  about  lanouiiic,  about  oloov- 

:  thne ,  about  design t  about  tools,  about  hundreds 
of  little  pieces  of  programming  "widsora"  that 
collectively  set  apart  the  professional  from  the. 
amateur.  And  were  it  not  too  presumptuous,  we 
might  have  mentioned  that  once  in  a  while  know¬ 
ledge  even  passes  from  student  to  teacher. 

But  what  of  the  "raw  economics"?  We've 
speculated  about  this.  We've  experimented  with 
it.  Our  experiments  surpass  even  our  wildest 
speculations.  Of  course,  the  first  saving  comes 
because  teams  of  5  students  run  only  one-fifth 
the  number  of  assignments  through  the  computer. 

Yet  the  lessons  for  each  student  are  far  greater 
than  the  old  secretive  method,  for  each  student 
sees  many  approaches,  not  just  one. 

Secondly,  the  number  of  runs  to  produce  a 
correct  program  (and  not  just  a  "working"  pro¬ 
gram)  is  drastically  reduced.  The  magnitude  of 
reduction  depends  somewhat  on  the  size  of  the 
problem,  but  a  typical  figure  for  student  batch 
problems  is  from  an  average  of  20  runs  to  an 
average  of  2.  Actually,  n:osi  programs  run  cor¬ 
rectly  the  first  time  on  the  machine — «T >:•/  the 
students  can  demonstrate  it.  They  had  better  be 
able  to;  if  not,  the  other  teams  cut  it  to  pieces. 

Thirdly,  the  programs  themselves  tvpicallv 
will  run  more  ef f iclent ly--onre  attention  is  on 
design,  rather  than  grammar  and  spelling.  The 
factor  of  30  lost  by  the  archaeologist's  program¬ 
mer  would  not  be  untypical,  but  suppose  we  modest¬ 
ly  put  this  factor  as  2.  Putting  the  three  fac¬ 
tors  together  we  reduce  machine  costs  bv  a  factor 
of  about  100  (5  x  10  x  2).  Certainly  that's 
enough  "raw  economics"  to  permit  us  to  choose 
our  programming  tools  on  the  basis  of  what  they 
will  teach,  not  what  thev  will  cost. 


with  machine  economics.  We  can  use  this  method 
quite  successfully  when  there  is  no  compiler  for 
the  language  we  want  to  teach,  when  there  is  no 
compiler  at  all,  and  vho*:  ffinv  is  >:v  o\:ss. 

When  the  probability  of  a  program  working  the 
first  time  is  so  high,  in  many  circumstances  ther 
is  not  much  to  be  gained  from  actually  running 
the  program.  Getting  rid  of  the  computer,  or  at 
the  very  least  controlling  the  incredible  varianc 
it  usually  introduces,  permits  us  to  plan  with 
more  confidence,  and  to  stick  with  the  plan  as 
the  semester  unfolds.  Indeed,  once  freed  of  the 
constraints  of  the  machine,  the  course  economics 
depend  mostly  on  the  availability  of  teacher  and 
classroom,  but  once  beyond  the  barest  introdue- 
t  ion,  group  of  programmers  can  teach  * 

selves  in  this  way,  on  t he  iob  or  off. 

Actually,  the  economics  of  the  on-the-job 
training  reverses  the  usual  college  assumptions. 
The  students  are  likely  to  be  making  higher  sal¬ 
aries  than  the  instructor,  and  certainly  are  when 
considered  as  a  class.  Tt  simply  doesn't  pay  to 
transport  professional  programmers  to  a  cross¬ 
town  campus  for  programming  lectures  when  they 
can  learn  much  more  in  less  time  by  inspecting 
each  others'  work. 

There  is  an  implicit  challenge  here,  to  the 
professional  teacher  who  wants  to  stay  out  of 
the  unemployment  lines.  With  the  vi^hi  leader, 
a  small  group  of  programmers  can  multiply  their 
learning  through  review  techniques.  The  leader 
has  to  help  them  invest  the  money  saved  on  lec¬ 
tures  and  machine  time  in  fruitful  alternatives, 
such  as  instrumented  runs  that  permit  them  to 
make  design  and  algorithm  comparisons  that  would 
he  difficult  to  perform  analytically. 

The  leader  can  give  them  more  problems  to  do 
in  the  same  time,  or  guide  them  in  exploring 
more  alternatives  on  the  same  problems.  In  this 
way,  the  students  can  be  nudged  along  the  path  to 
design,  rather  than  to  even  more  obscure  bit- 
twiddling.  Tc  >ls  can  be  evaluated  in  actual  use, 
perhaps  giving  some  return  for  the  billions  we've 
invested  in  them.  But  it  will  take  a  lot  of 
running  for  an  instructor  to  stay  ahead  in  such 
an  environment. 

This  new  environment  may  not  suit  the  old- 
fashioned  teacher  who  sees  the  instructor  job  as 
a  kind  of  Olivier  playing  Hamlet  to  packed  houses 
of  sleeping  students.  Neither  may  it  suit  the 
typical  computer  jockey  whose  narrow  mind  and 
dogged  persistence  have  served  to  get  so  many  A's 
from  the  usual  programming  classes.  If  such 
people  are  thus  encouraged  to  leave  the  program¬ 
ming  profession,  it  can  be  counted  as  another 
plus  for  this  system  of  professional  education. 

Some  Future  History 

1 f  we  study  the  development  of  other  high 
technology  fields,  such  as,  electrical  engineer¬ 
ing,  machine  tools,  telephonies  steam  power, 
and  printing,  wo  sec  th.it  output 1  'g  is  not  unique 
We  mav  be  going  through  the  stages  faster,  hut 


J 


But  the  benefits  to  education  do  not  stop 
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we’re  going  through  them  all  the  same.  Our  first 
quarter  century  has  been  fast-paced,  and  driven 
almost  entirely  by  the  "technological  imperative". 
Hardware  sales  have  been  the  alpha  and  omega  of 
our  narrow  world.  Anything  that  didn't  promote 
the  sale  of  more  hardware  was  left  behind  in  the 
rush  for  survival  in  the  technology  jungle. 

For  the  past  few  years,  since  IBM  "unbundled" 
the  sales  managers  have  come  to  understand  that 
software  is  a  product,  just  like  hardware — but 
with  an  even  greater  potential  market.  The  soft¬ 
ware  rush  is  now  in  its  adolescent  stage,  and 
repeating  the  entire  hardware  history  at  an 
even  faster  pace. 

Yet,  today,  after  an  investment  of  perhaps 
200  billion  dollars  in  hardware  and  software, 
the  achievements  are  relatively  small.  Relative 
to  what?  Relative  to  what  the  future — if  history 
is  any  guide — holds  in  store.  One  measure  of 
our  immaturity  is  the  lack  of  any  true  program¬ 
ming  profession.  The  career  path  of  choice  for 
programmers  today  is  "up  and  out".  For  most, 
there  is  nothing  to  learn  after  mastery  of 
grammar  and  spelling,  so  there  is  little  economic 
incentive  to  retain  a  higher-paid  "experienced" 
programmer  where  a  freshly  trained  beginner  is 
available. 

And,  when  an  individual  does  surpass  this 
narrow  training,  there  is  nobody  in  management 
to  rec'ognize  how  valuable  such  a  professional 
really  is.  The  managers,  after  all,  once  took 
a  "programming"  course.  They  know  that  program¬ 
ming  is  unprofessional,  shallow,  and  unmanageable. 
They  know  that  money  spent  on  training  is  wasted, 
and  would  be  better  invested  in  some  new  hard¬ 
ware,  or  a  software  tool  that  promises  to  replace 
a  few  programmers. 

Ail  of  have  been  hypnotized  by  a  running 
sales  pitch  consisting  of  fallacious  "raw  econo¬ 
mics"  and  illusory  "real  worlds".  We  have  spent 
billions  for  "tools",  but  not  pennies  on  under¬ 
standing  what  is  needed  to  create  the  professional 
technical  leaders  who  will  actually  use  them. 

We’ve  spent  millions  on  "schooling",  but  skimped 
on  real  learning.  As  a  result,  our  tools  lie  on 
the  shelf,  misunderstood  and  little  used.  Our 
systems  seem  to  cost  too  much,  but  conference 
after  conference  merely  repeats  the  sales  pitch 
of  its  predecessor — buy  more  things'.  Our  systems 
fail  to  satisfy,  but  all  we  hear  is  that  people 
don’t  understand  the  finicky  nature  of  computers — 
the  next  generation  will  solve  all  that! 

The  next  "generation"  will  come  when  we  out¬ 
grow  our  adolescent  fascination  with  toys  and 
develop  an  adult  interest  in  people.  Then  we 
will  begin,  as  other  technologies  have  done,  to 
master  the  social  and  psychological  forces  that 
are  the  real  power  behind  successful  technology — 
and  the  real  reason  for  technology  in  the  first 
place.  An  excellent  starting  point  would  be  to 
take  computer  training  out  of  the  hands  of  com¬ 
puters.  And  perhaps  put  it  in  the  brains  of 
people. 
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Abstract 

This  paper  reports  on  the  progress  made  in 
dynamic  phenomenological  modeling  of  the  software 
life  cycle.  An  overview  rationale  is  presented  to 
demonstrate  the  need  for  a  dynamic  life  cycle 
model.  The  software  life  cycle  behaves  like  a  nar¬ 
row  band  gaussian  process.  The  linkage  to  informa¬ 
tion  theory  is  suggested  and  the  concept  that  soft¬ 
ware  systems  have  a  characteristic  bandwidth  and 
behave  like  bandpass  filters  is  used  to  quantify 
Brooks'  Law  and  show  why  managers  have  little  flex¬ 
ibility  in  specifying  the  development  time  of  a 
system.  Productivity  varies  inversely  as  the 
square  root  of  the  average  applied  manpower,  hence 
managerial  efforts  to  speed  up  projects,  increase 
productivity  and  cut  costs  are  non-productive  be¬ 
cause  these  measures  are  functionally  related  in  a 
counterintuitive  way.  An  example  is  presented  to 
show  how  the  Norden/Rayleigh  model  is  used  to  gen¬ 
erate  quantitative  answers  to  the  management 
questions:  Can  I  do  it?  How  much  will  it  cost? 

How  long?  How  many  people?  What's  the  risk? 

What's  the  trade-off? 

I .  Introduction 

To  do  meaningful  tasks  in  the  functional  areas 
of  modern  government  and  business  entities,  most 
large  scale  software  takes  2-3  years  to  develop,  and 
has  an  operational  life  of  6  to  10  years  before  being 
replaced.  This  is  a  life  cycle.  A  model  of  this 
life  cycle  has  been  developed  and  can  be  used  to 
forecast  and  manage  the  costs,  schedules  and  man¬ 
loading  requirements  of  large  software  projects. 

Time  is  the  independent  variable  whether  explicify 
treated  or  not  (careful  attention  to  management 
thinking,  guidance,  etc.,  will  show  that  time  is 
always  implicitly  recognized  and  is  central  to  their 
planning  and  thinking--al 1  budgets,  plans,  and  con¬ 
tracts  have  a  time  base). 

Managers  ask  these  questions  of  software  pro¬ 
jects  just  as  they  do  of  any  other  process  that 
consumes  resources: 

How  much  will  it  cost? 

How  long  will  it  take? 

Because  the  answers  to  these  questions  will  go 
into  a  resource  allocation  plan,  or  budget,  a  tire 


base  is  implied  and  the  manager  is  really  asking  for 
much  more.  Typically,  he  really  wants  these  answers: 


Can  I  do  it? 

How  much  will  it  cost? 

How  long  will  it  take? 

How  many  people  at  any  time? 

What  kind  of  skills? 

What  are  the  risks? 

Are  there  trade-offs?  What  are  they? 

How  do  the  constraints  affect  these  answers? 

These  are  the  questions  we  can  now  answer. 

In  Section  II  a  broad  overview  of  the  software 
cycle  is  provided.  Section  III  shows  how  to  apply 
this  model  to  a  real  world  software  project. 

Answers  to  the  management  questions  are  determined 
quantitatively  and  presented  in  a  form  directly 
usable  by  financial  and  project  managers.  To  pro¬ 
vide  a  framework  for  this  work,  the  fundamental 
aspects  of  software  systems  development  and  its 
interrelationship  with  physics  and  information 
theory  is  presented  in  Section  IV.  Finally,  sum¬ 
mary  tables  are  presented  in  the  last  section  for 
quick  reference  to  the  techniques  necessary  for 
forecasting  costs  and  schedules. 

1 1 .  Overview  of  the  Software  Life  Cycle 

Many  industrial  processes  are  linear  or  nearly 
linear.  This  means  that 

Quantity  =  rate  x  time 

where  rate  is  a  constant. 

In  a  people-intensive  activity  it  means 
Effort  =  manpower  x  time 
Cost  =  Cost/unit  effort  x  effort 
where  manpower  (a  rate)  is  assumed  constant  and 


cost/unit  effort  is  a  standard  cost,  or  a  time 
average . 

A  further  assumption  is  that  the  production 
rate  (number  of  "widgets "/day )  is  directly  propor¬ 
tional  to  manpower  (et tort/day) ,  or  since  the  time 
base  is  the  same. 

Effort  Product  (no.  of  items) 

This  doesn't  work  for  software  because  software 
production  rates  are  not  linear  with  time  and  the 
effort  is  not  directly  (linearly)  proportional  to 
product. 

Sometimes  mildly  non-linear  rates  can  be  re¬ 
placed  by  averages  over  time  (standard  labor  cost¬ 
ing  is  an  example). 

If  the  rates  are  continuously  varying  then  the 
solution  lies  in  the  use  of  the  calculus.  This  is 
a  scary  word  because  many  people  have  not  studied 
it  and  have  heard  that  it  is  vague,  abstract,  hard 
to  apply  and  only  long  hairs  use  it  to  prove  obscure 
scientific  points  of  no  practical  consequence. 

Vie  have  to  accept  that  built-in  bias,  solve 
the  problem  using  calculus  and  then  present  the  re¬ 
sults  in  a  tabular  and  graphical  form  which  can  re¬ 
late  the  curvilinear  relationships  in  a  form  pala¬ 
table  to  most  planners  and  decision  makers. 

Now,  if  our  problem  is  not  only  curvilinear  in 
its  fundamental  relationships  but  also  possesses  a 
random  character,  this  means  the  probability  and 
statistical  laws  come  into  play,  it  means  we  cannot 
measure  rates,  effort,  time,  and  product  with  great 
precision  because  these  quantities  are  always  fluc¬ 
tuating  about  some  average  (and  that  average  may  be 
changing  over  time),  so  what  we  have  to  do  is  work 
with  averages  of  the  quantities  and  a  measure  of 
the  variability  of  the  ian t i t i es  (standard  devia¬ 
tion).  This  is  an  important  philosophical  point 
because  it  means  that  only  a  certain  level  of  accu¬ 
racy  and  precision  is  possible  and  all  efforts  to 
do  better  are  futile.  It  further  means  that  the 
statistics  improve  precision  with  higher  and  higher 
levels  of  aggregation  and  get  worse  with  increasing 
levels  of  disaggregation.  The  implication  of  this 
is  that  precise,  detailed,  work  breakdown  struc¬ 
tures  are  highly  unlikely  to  be  accurate  or  mean¬ 
ingful  in  a  quantitative  way. 

Given  this  background,  the  software  estimating 
and  control  problem  is  really  a  different  problem 
depending  on  where  one  is  in  the  software  life 
cycle. 

During  the  feasibility  and  functional  design 
phases,  it  is  a  pure  estimation  problem  using 
phenomenology  and  past  experience  (data)  to  fore¬ 
cast  a  time  varying  future  event.  A  model  of  the 
observed  behavior  is  appropriate.  This  model  should 
be  a  time  varying  model  and  should  have  parameters 
that  relate  directly  to  the  management  questions. 

The  Rayl eigh/Norden  model  Putnam  has  chosen  meets 
these  criteria. 

A  life  cycle  forecast  is  an  intelligent  "guess" 


of  how  a  process  will  behave  over  an  8  or  10  year 
period  based  on  the  information  we  have  now.  Hut 
our  input  information  about  the  system  we  plan  to 
build,  or  are  building,  is  continually  changinq  for 
a  variety  of  reasons.  Requirements  change  because 
of  external  factors  (regulatory;  the  way  we  do  busi 
ness,  etc. ) . 

This  requirements  chanqe  process  means  that 
our  estimate  done  a  short  time  ago  becomes  obsolete 
and  needs  revision  to  reflect  today's  reality. 

This  can  be  done  by  just  re-estimating  using 
the  new  input  information  if  the  development  has 
not  yet  started.  However,  if  development  is  under 
way  the  problem  is  different.  We  have  to  adapt  on 
the  fly.  We  have  to  assess  what  an  incremental 
change  applied  at  a  specific  time  will  have  on  the 
process  at  all  future  times.  Clearly,  this  is  a 
dynamic  modeling  requirement.  Again  we  want  the 
answers  expressed  in  the  form  of  answers  to  the 
management  questions  so  that  managers  can  reprogram 
resource  allocation  and  assess  risks  in  terms  of 
contractual  obligations. 

Now  a  static  model  that  does  not  adapt  to  the 
real  data  coming  from  the  actual  project  can  only 
do  this  in  a  very  imperfect  and  superficial  way. 

All  the  current  models  (Doty,  PRIC.F-S,  IBM,  GRC, 
TRW)  except  one  (Putnam)  are  static  -  they  do  not 
treat  time  explicitly  and  they  do  not  haye  the  ca¬ 
pability  to  adapt  to  the  actual  behavior  of  the 
system  at  any  instant  of  time  in  that  8-10  vear 
life  cycle  period. 

The  best  data  you  will  ever  have  is  that  com¬ 
ing  from  the  project  you  •••■  inn  .  n.  A  care¬ 
ful  fitting  of  that  data  to  a  mode’  that  is  faith¬ 
ful  to  the  phenomenology  involved  will  give  the 
best  possible  answer  and,  moreover,  will  continual¬ 
ly  adapt  to  the  dynamics  of  the  chanqe  process. 

This  means  that  it  is  alwavs  converging  toward  the 
true  answer  which,  alas,  is  alwavs  unknown  (exactl  v) 
until  the  process  is  over.  But  the  convergent  pro¬ 
perty  is  hiqhly  useful  because  it  means  we  are  al¬ 
ways  getting  closer  and  closer  to  the  true  behavior 
even  though  we  can  only  determine  the  average  be¬ 
havior  and  the  st.atistiral  uncertainty  (due  to  the 
random  character)  at  any  instant. 

But  the  statistical  uncertainty  is  useful,  it 
lets  us  control  the  process.  If  we  can  estimate 
that  one  year  from  now  a  system  should  require  ?‘j 
people  with  a  statistical  uncertainty  ('  1  '1  of  S 
people,  and  we  find  when  we  get  to  that  point  that 
AO  people  are  working  on  the  system,  then  higher 
management  will  know  that  something  signif  leant  i.as 
happened  due  to  a  real  cause  because  therp  is  less 
than  1  chance  in  10(1  that  such  a  large  excursion 
from  the  predicted  would  occur  because  of  pure 
chance  (random  fluctuation). 

This  is  academic  for  the  1st  line  project 
manager;  he  knew  when  he  had  to  add  people,  or 
shift  resources.  It  may  not  be  apparent  to  finan¬ 
cial  and  budgetary  officials  well  removed  from 
daily  contact  who  review  the  project  only  periodi¬ 
cally  from  records  and  reports. 
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On  the  other  hand  an  excursion  of  '  3  or  4 
people  would  be  normal  -  inherent  in  the  random 
character  of  the  process  -  no  cause  for  concern. 

So  an  adaptive  model  is  necessary  for  dynamic 
update  to  tell  us  where  we  are  and  where  we  appear 
to  be  heading  based  on  all  information  available  now. 
The  Rayleigh/Norden  model  does  this  by  continually 
fitting  each  piece  of  new  information  to  revise  the 
most  recent  estimate.  The  technical  term  for  this 
is  "adaptive  filtering".  It  is  really  a  real  time 
process  controller  that  gives  the  optimal  future 
resource  allocation  (consumption)  pattern  at  any 
instant  in  time.  The  Box  (13,15)  fitting  technique 
is  the  implementation  of  the  adaptive  filtering  con¬ 
cept. 

The  requirements  change  process  is  modelled 
using  the  second  order  Rayleigh  differential  equa¬ 
tion.  This  cfn  be  made  to  respond  in  real  time  be¬ 
cause  it  can  be  solved  step-wise  in  discrete  fashion 
using  the  very  general  Runge-Kutta  numerical  solu¬ 
tion  to  differential  equations.  The  theory  is  a 
little  deep,  but  the  implementation  is  very  straight 
forward  on  a  calculator  or  computer  because  all  one 
has  to  know  is: 

-  the  actual  elapsed  time,  t 

-  the  actual  manpower  on  board,  y 

-  the  cumulative  effort  expended  so 
far,  y 

-  the  best  estimate  of  the  difficulty 

2 

from  previous  experience,  K/t^  , 

where  K  is  the  life  cycle  effort  and 
td  is  the  development  tine, 

-  and  an  estimate  of  how  much  the  diffi¬ 
culty  is  likely  to  change  as  a  result  of  a  require¬ 
ments  change.  Linear  thinking  is  reasonably  valid 
here.  If  about  25  of  the  application  programs 
are  affected  by  a  requirements  change  then  there 
will  be  about  a  +25  change  in  the  difficulty. 

y  +  t/t/  y  +  1/f/y  =  D  (1  +  .25) 

=  K/td2  (1  +  .25) 

is  the  form  we  used  to  study  the  impact  of  the  re¬ 
quirements  change  process.  We  just  project  ahead 
at  t  +  '  t,  t  +  2At,  t  +  3 At,  etc.,  and  see  how  y, 
y  compare  with  the  earlier  estimates  made  prior  to 
the  impact  of  the  requirements  change  process.  The 
deference  will  be  the  incremental  manpower  and 
effort  required  to  accommodate  the  change. 

Trackinq  and  fitting  throughout  the  life  cycle 
is  important  because  60  of  the  life  cycle  effort 
qoes  on  in  the  operations  and  maintenance  phase. 

If  this  is  treated  as  a  level-of-effort  task,  then 
far  more  resources  than  are  necessary  are  used. 

This  inhibits  future  development  capability  given 
budget  constraints.  Moreover,  some  70-80'  of  total 
software  work  is  in  the  operations  and  maintenance 
phase  so  unless  this  work  is  optimally  controlled, 
then  the  projection  of  Harlan  Mills*  will  become 


true  and  there  will  be  no  capability  to  do  new  de¬ 
velopment  work  if  the  software  house  has  to  work 
with  a  fixed  manpower  or  budget  constraint  (very 
common  in  government). 

Given  this  background,  let’s  examine  how  w<- 
use  the  Rayleigh/Norden  equation  to  obtain  engineer¬ 
ing  quulity  answers  during  the  early  specification 
and  functional  development  phases  of  a  software  pro¬ 
ject.  The  following  example  will  illustrate  simple 
applications  of  several  powerful  techniques. 

111.  fxample  of  an  Early  Sizing,  Cost  and  Schedule 

Estimate  for  an  Application  Software  System 

Software  development  has  been  characterized  by 
severe  cost  overruns,  schedule  slippages  and  an  in¬ 
ability  to  size,  cost  and  determine  the  development 
time  early  in  the  feasibility  and  functional  design 
phases  when  investment  decision  must  be  made.  Mana¬ 
gers  want  answers  to  the  following  questions:  Can 
1  do  it?  How  much  will  it  cost?  How  long  will  it 
take?  How  many  people?  What's  the  risk?  What's 
the  trade-off?  This  portion  of  the  paper  shows 
how  to  size  the  project  in  source  statements  (S  ), 

how  to  relate  the  size  to  the  management  parameters 
(life  cycle  effort  (K)  and  development  time  (t^)! 

and  the  state-of-technology  (Ck)  being  applied  to 
the  problem  through  the  software  equation, 

SR  =  t^2^.  The  software  equation  is  then 

solved  using  a  constraint  relationship  K  -  v  i)  •  t^^ , 

where  j'.'D  !  is  the  magnitude  of  the  difficulty  gra¬ 
dient  empirically  found  to  be  related  to  system 
development  characteristics  measuring  the  degree 
of  concurrency  of  major  task  accomplishment.  Monte 
Carlo  simulation  is  used  to  generate  statistics  on 
variability  of  the  effort  and  development  time.  Thp 
standard  deviations  are  used  to  make  risk  profiles, 
finally,  having  the  effort  and  development  time 
parameters,  the  Rayleigh/Norden  equation  is  used  to 
generate  the  manpower  and  cash  flow  rate  at  any 
point  in  the  life  cycle.  The  results  obtained  de¬ 
monstrate  that  engineering  quality  quantitative 
answers  to  the  management  questions  can  he  obtained 
in  time  for  effective  management  decision  making. 

Backgroundand  Approach 

Over  the  past  four  years  the  author  has  studied 
the  manpower  vs  time  pattern  of  several  hundred 
medium  to  large  scale  software  development  projects 
of  different  classes.  These  projects  all  exhibit  a 
similar  life  cycle  pattern  of  behavior  -  a  rise  in 
manpwer,  a  peaking  and  a  tailing  off.  Many  of  these 

■k 

Mills  projected  saturation  with  maintenance  work 
a  few  years  ahead  leaving  no  capability  to  do  new 
work  unless  there  was  an  ever  expanding  software 
work  force.  Mills  implicit  assumption  was  level-of- 
effort  on  existing  systems  and  no  death  process. 

The  Rayleigh  equation  automatically  accounts  for  ob¬ 
solescence  a  id  death  and  hence  provides  for  new  de¬ 
velopment  capability.  This  is  what  actually  happens 
and  is  demonstrated  by  the  data. 


projects  (and  all  the  large  ones)  follow  a  time  pat¬ 
tern  described  by  the  life  cycle  curves  of  Norden 
(7,8)  which  are  of  the  general  Weibull  class  and 
more  specifically  the  Rayleigh  form, 

2  2 

y  -  K/t  2  .  t  .  e  -t  /2tj  ,  where  >  is  the  manpower 

at  any  time  t;  K  is  the  area  under  the  curve  and  is 
the  nominal  life  cycle  effort  in  manyears;  trf  is 

the  time  of  peak  manpower  in  years  and  corresponds 
very  closely  to  the  development  time  for  the  system. 

Even  though  large  systems  seem  to  follow  this 
general  pattern,  some  small  systems  do  not.  They 
seem  to  have  a  more  rectangular  manpower  pattern. 

The  reason  for  this  is  that  the  applied  manpower 
pattern  is  determined  by  management  and  by  contrac¬ 
tual  agreements.  Many  small  projects  are  estab¬ 
lished  as  level -of -effort,  contracts  -  hence  rec¬ 
tangular  manloading.  For  large  projects  this  is 
generally  inadequate  because  managers  have  a  poor 
intuitive  feel  for  the  resources  to  do  the  job.  Ac¬ 
cordingly,  they  tend  to  respond  to  the  needs  of  the 
system  reactively.  This  results  in  time  lags  and 
underapplication  of  effort  at  some  instant  in  time, 
but  the  effect  is  a  reasonably  close  approximation 
to  Rayleigh  manloading. 

The  author  has  shown  in  earlier  works  (9-14) 
that  there  is  a  Rayleigh  law  at  work.  It  is  the 
1st  subcycle  of  the  overall  development  curve 
called  the  design  and  coding  curve  (detailed  logic 
design  and  coding).  This  is  also  a  manpower  curve 
that  is  proportional  to  the  analyst  and  programmer 
manpower  -  the  direct  productive  manpower.  This 

curve  is  denoted  j'j.  Its  form  is 
2 

y,  =  K/td2  t  e  '3t  /td2  (MY/YR)  when  related  to  the 
original  definition  of  K  and  td  for  the  overall 

burdened  life  cycle  curve.  When  this  curve  is 
multiplied  by  the  average  productivity  (PR)  for  the 
project  it  yields  the  rate  of  code  production. 

dS  •  — 

s^  =  S$  =  2.49  PR  y^,  where  the  2.49  is 
dt 

necessary  to  account  for  the  definition  of  produc¬ 
tivity  as  a  burdened  number  ( i . e . ,  includes  over¬ 
head  and  support  activities).  Now  the  time  inte¬ 
gral  of  the  rate  of  code  production  yields  the  to¬ 
tal  number  of  source  statements, 

Ss  =  :  ds.  dt  =  PR  2.49 /y,  dt 
'  0  dt  «  1 

S$  -  PR  .  2.49.K/6. 

The  author  has  found  that  tne  PR  is  related  to 
the  Rayleigh  parameters  K  and  td  in  the  following 

manner  (14): 

PR  -  Cn  (K/^2)"2^3  where  the  term  K/tj2  has 
been  defined  as  the  system  difficulty  in  terms  of 


effort  (K)  and  time  (td)  to  produce  it  and  Cn  is  a 

quantized  constant  defining  a  family  of  such  curves, 
is  a  channel  capacity  measure  in  the  information 

theory  sense,  but  in  a  more  practical  sense,  it  seems 
to  be  a  measure  of  the  state-of-technology  being 
apptied  to  a  particular  class  of  system. 

Substituting  for  PR,  we  obtain  the  software 
equation: 

Ss  =  2.49  Cn(K/td2)  '2/3  K/6 

S  .  l-*±  c  y  k-2/3  t  4/3 
son  d 

Ss  =  CK  K^3  t^3 ,  where  CK  has  now 

2  49 

subsumed  -j —  C  . 

b  n 

Having  this  expression  which  now  relates  the 
product  in  source  statements  to  the  Rayleigh  man¬ 
power  parameters  (which  are  also  the  management 
parameters),  we  turn  to  a  practical  way  in  which  to 
estimate  the  size  ( Ss ) ,  effort  (K)  and  development 

time  (td)  of  a  software  project  early  in  the  re¬ 
quirements  and  specification  phase  of  the  project. 
This  will  let  us  answer  the  management  questions 
necessary  for  effective  investment  decisions  for  the 
software  project. 

We  will  do  this  in  the  form  of  a  case  history 
for  a  project  we  will  call  SAVE.  First,  we  will 
show  a  way  to  obtain  a  good  estimate  of  the  number 
of  source  statements.  We'll  plot  the  software  equa¬ 
tion  and  establish  a  feasible  region  for  our  develop¬ 
ment  time  parameters,  we  will  impose  a  constraint 
relation  involving  K.  and  td-  We  will  do  a  Monte 

Carlo  simulation  to  generate  variances  for  K  and 
td.  With  these  numbers  in  hand,  we  can  then  do  a 

trade-off  analysis,  pick  a  reasonable  effort  (cost) 
time  combination  and  complete  our  translation  into 
quantitative  answers  to  the  management  questions. 

The  answers  we  obtained  will  be  close  to  optimal 
for  the  given  constraint  and,  moreover,  we  will 
automatically  have  a  sensitivity  and  risk  profile. 

Initial  Sizing 

Given  the  broad,  preliminary  design  of  SAVE 
consisting  of  the  processing  flow  of  the  major 
functions  and  the  estimates  by  the  designers  of  the 
size  range  of  the  major  functions,  we  can  make  a 
preliminary  estimate  of  the  development  time,  devel¬ 
opment  effort  and  development  cost  to  build  the  sys¬ 
tem.  (See  Figure  1.) 

The  input  data  from  the  project  team  are  in  the 
form  of  size  ranges  for  each  major  function.  Three 
or  four  team  members  estimated  the  size  of  each 
function  as  follows: 

-  Smallest  possible  size  (in  source  statements)  -  a 

-  Most  likely  size  -  m 
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-  Largest  possible  size  -  b 

These  were  averaged  for  each  function  and  resulted 
in  the  first  3  columns  of  Table  1.  This  was  in  ef¬ 
fect  a  Delphi  polling  of  experts  and  their  consensus. 
(Having  done  this  with  several  groups  of  systems  en¬ 
gineers,  it  is  interesting  to  note  that  they  are 
very  comfortable  with  this  procedure.) 

Note  that  this  results  in  a  broad  range  of 
possible  sizes  for  each  function  and  that  the  dis¬ 
tribution  is  skewed  on  the  high  side  in  most  cases. 
This  is  typical  of  the  Beta  distribution,  thechar- 
acteristics  of  which  are  used  in  PERT  estimating. 

We  adopt  the  PERT  technique  to  get  an  overall  sys¬ 
tem  size  range  and  distribution. 

1.  An  estimate  of  the  expected  value  of  a 
Beta  distribution  is: 

r  a  +  4m  +  b 
Li  =  6 

The  overall  expected  value  is  just  the  sum  of  the 
individual  expected  values. 

N 

E  =  E.  . 

l 

i  =  l 

This  is  ihe  sum  of  the  fourth  column  of  Table  1 
(98475  Ss). 

2.  An  estimate  of  the  standard  deviation  of 
any  distribution  (including  Beta)  is  the  rangewith- 
in  which  99  of  the  values  are  likely  to  occur  di¬ 
vided  by  6,  i .e. , 

o.  =  I  b-a  I  /6. 

The  overall  standard  deviation  is  the  square  root 
of  the  sum  of  the  squares  of  the  individual  stan¬ 
dard  deviations,  i.e.. 


This  results  in  a  much  smaller  standard  deviation 
than  one  would  "guess"  by  just  looking  at  the  in¬ 
dividual  ranges:  the  reason  is  that  some  actuals 
will  be  lower  than  expected  (E^ );  others  will  be 

higher.  The  effects  of  these  variations  tend  to 
cancel  each  other  to  some  extent.  This  cancelling 
effect  is  best  represented  by  the  root  of  the  sum 
of  squares  criterion. 

The  result  is 

E  =  98475  source  statements 

-  *  7081  source  statements 

tot 

and  the  99  range  is  77,000  -  120,000  Ss,  or  we  are 

99'.'.  sure  that  the  ultimate  size  will  be  in  this 
range  if  the  input  estimates  do  not  change.  Of 


course,  if  the  input  estimates  change,  we  should  re¬ 
do  our  calculations  and  revise  the  results  accord¬ 
ingly. 

Development  Time-Effort  Determination 

Table  2  is  a  result  of  using  the  software  equa¬ 
tion  which  relates  the  product  in  source  statements 
to  the  effort,  development  time  and  state-of-tech- 
nolo§y  being  applied  to  the  project.  The  equation 
is  derived  partly  from  theory  and  partly  from  an 
empirical  fit  of  a  substantial  body  of  productivity 
data.  The  form  of  the  equation  is: 


where  S$  is  the  number  of  end  product  delivered 
source  lines  of  code,  an  information  measure. 

is  a  state-of-tcchnology  constant.  For  the 

environment  anticipated  for  SAVE  this  constant  is 
10040.  can  be  determined  by  calibration  against 

the  software  equation  using  data  from  projects  de¬ 
veloped  by  the  same  software  house  using  similar 
technology  and  methods. 

K  is  the  life  cycle  effort  in  man  years.  This 
is  directly  proportional  to  development  effort 

(Dev  Effort  =  . 4K )  and  cost  (S/MY  .  K  =  SLC  cost; 
57MY  .  ( . 4 K)  =  S  Dev). 

td  is  the  development  time  in  years.  This 
corresponds  very  closely  to  customer  turnover. 

Figure  2  shows  a  parametric  graph  of  this  equa¬ 
tion. 

Table  2  presents  three  scenarios  for  5  differ¬ 
ent  points  in  the  size  distribution  curve.  The  ex¬ 
pected  case  is  given  in  the  row  labelled  E.  The 
column  under  trf  -  2  years  gives  a  nominal  develop¬ 
ment  effort  of  23.59  man  years,  S1.18M  cost 
(@  c50, 000/MY)  to  do  98475  source  statements. 

The  fastest  (or  minimum)  possible  time  for 
98475  source  statements  is  1.81  years.  The  corres¬ 
ponding  development  effort  is  35.4  MY,  and  cost  of 
$1.77  million.  The  assumption  here  is  that  the 
system  is  a  stand  alone  and  the  gradient  condition 
of  |VD|  -  15  cannot  be  exceeded. 

The  risk  biased  column  is  based  on  deliberately 
adding  time  (.4  of  a  year)  to  the  minimum  time  to 
increase  the  probability  of  being  able  to  deliver 
the  product  at  the  contract  specified  date.  This 
biasing  is  to  allow  for  external  factors  such  as 
late  delivery  of  a  computer,  an  average  number  of 
requirements  changes  during  development,  etc.  In 
the  case  of  98,475  source  statements,  this  would  be 
1.81  +  .4  =  2.21  years.  The  corresponding  expected 
development  effort  is  15.91  MY;  $.8  million  cost. 
Note  that  development  effort  and  cost  go  down  as 
time  to  do  the  job  is  increased.  This  is  Brooks1 
law  at  play.  Conversely,  there  is  no  free  lunch-- 


Table  1. 


Major  Function 

Ss 

Least 

a 

Ss 

Most  Likely 

m 

S 

s 

Most 

b 

Expected 

Ss 

Standard 

Deviation 

Ss 

Maintain 

3675 

13375 

18625 

13467 

1658 

Search 

5577 

8988 

13125 

9109 

1258 

Route 

3160 

3892 

3300 

4588 

940 

Status 

350 

1425 

2925 

1579 

346 

Browse 

1875 

4052 

8250 

4389 

1063 

Print 

1437 

2455 

6125 

2897 

781 

User  Aids 

6875 

10625 

16250 

10938 

1563 

Incoming  Msg 

5830 

8962 

17750 

9905 

1987 

Sys  Monitor 

9375 

14625 

28000 

15979 

3104 

Sys  Mgt 

6300 

13700 

36250 

16225 

4992 

Comm  Proc . 

5375 

8975 

14625 

9400 

1458 

98475 

Table  2.  SAVE 


Assumption:  On-line,  interactive  development. 

Top  down  structured  programming, 

HOL,  Contemporary  development 
environment.  No  machine  constraints. 
=  10040;  Standalone  System  --  |\  D|  =15 


td  =  2  yrs 

Fastest  | 

Risk  Biased 

Dev  Effort 
(MV) 

ld 

Dev  Effort 
(MV) 

td  +  *r 

Oev  Effort 
(MV) 

11.28 
( $ . 564M ) 

1 .63 

25.80 
($1 .29M) 

2.03 

10.71 

($.55M) 

18.86 

($.943) 

1.75 

32.16 

($1.61) 

2.15 

14.12 

($.71M) 

23.59 
($1 . 180M) 

1.81 

35.40 

($1.77M) 

2.21 

15.91 
( $ . 7  96M 

29.05 
($1 .45M) 

1 .86 

38.71 

($1.84M) 

2.26 

17.77 

($.89M) 

42.69 

(S2.135M) 

1.97 

45.65 
($2 . 28M) 

2.37 

21 .77 
($1 .09M 
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if  time  is  shortened  the  cost  goes  up,  dramatically . 


This  can  be  illustrated  by  obtaining  the  trade¬ 
off  law  from  the  software  equation.  Solve  the  soft¬ 
ware  equation  for  K: 


.1/3  .4/3  , 

K  ld  '  Ck 


(Ktd4) 


1/3 


The  results  of  the  simulation  are  given  in  the 
next  table.  Notice  that  the  simulated  estimated 
development  effort  is  the  same  as  the  expected  de¬ 
terministic  value  and  the  development  time  is  also 
the  same.  This  is  as  it  should  be.  The  simulation 
produces  the  right  expected  values.  The  real  value 
in  the  simulation  is  that  it  produces  a  measure  of 
the  variation  in  effort  and  in  development  time 
which  we  can  used  to  construct  risk  profiles. 


This  is  the  trade-off  law.  In  terms  of  development 
effort,  E  =  .4K  so 


In  our  specific  case 


and  we  can  trade-off  between  2  years  (contract  con¬ 
straint,  say)  and  1.81  years--the  minimum  time  for 
our  gradient  constraint. 

Parameter  Determination  By  Simulation 

While  Table  2  gives  a  fairly  broad  range  of 
solutions  that  answer  many  "what  if"  questions,  it 
is  an  essentially  deterministic  solution;  that  is, 
it  assumes  we  know  the  input  information  exactly. 

Of  course,  we  don't. 

A  better  solution,  then,  is  one  in  which  we 
treat  the  uncertainties  in  our  input  information 
in  obtaining  our  solution.  This  is  generally  not 
feasible  analytically,  but  is  nicely  handled  by 
Monte  Carlo  simulation.  In  our  case  we  do  this  by 
letting  the  input  number  of  S?  vary  randomly  about 

the  expected  value  (98,475)  according  to  our  com¬ 
puted  standard  deviation,  =  7081,  and  letting 

ss 

the  stand-alone  gradient  (|VD|  =  15)  vary  within 
the  statistical  uncertainty  of  its  measured  (com¬ 
puted)  value  ( =  2). 

We  then  run  the  problem  on  the  computer 
several  thousand  times  with  these  random  varia¬ 
tions  in  parameters  and  generate  the  statistics 
of  the  variation  in  our  answer.  This  is  a  much 
better  measure  of  what  is  likely  to  happen  as  a 
result  of  the  uncertainties  in  the  problem. 


Major  Milestone  Determination 


The  results  of  the  simulation  determination  of 
the  development  time  are  used  to  generate  the  major 
milestones  of  the  project. 


These  milestones  relate  to  the  coupling  of  sub¬ 
cycles  of  the  life  cycle  to  the  overall  project 
curve  (9-14).  Examination  of  several  hundred  sys¬ 
tems  shows  this  coupling  is  very  stable  and  predict¬ 
able.  The  empirical  milestones  resulting  from  these 
earlier  studies  shows  the  following  scaling. 

Event 

Milestone  Fraction  of 
Development  Time,  t^ 

Critical  Design  Review 

.43 

Systems  Integration  Test 

.67 

Prototype  Test 

.80 

Start  Installation 

.93 

Full  Operation  Capability 

1.0 

Table  4  converts  this  to  the  appropriate  des¬ 
criptors  and  actual  time  schedule  for  this  project. 


Risk  Analysis 

The  results  of  the  SAVE  simulation  for  develop¬ 
ment  time,  development  effort  and  development  cost 
can  be  shown  in  the  form  of  probability  plots.  As¬ 
suming  a  normal  (gaussian)  distribution,  all  that  is 
necessary  is  an  estimate  of  the  expected  value 
(plotted  at  50y  level)  and  the  standard  deviation 
(plotted  offset  from  the  expected  value  at  the  16 
probability  level)  to  generate  the  line.  Then  one 
can  determine  the  probability  of  any  value  of  the 
quantity  in  question.  For  ease  of  presentation ,  the 
plots  are  summarized  in  Table  5.  For  example,  there 
is  a  90*  probability  that  the  software  development 
will  not  take  more  than  40  MY  of  effort.  There  is 
a  99  probability  it  will  not  take  more  than  45  MY 
of  development  effort.  There  is  only  a  10  proba¬ 
bility  it  will  take  less  than  30  MY  of  Development 
effort. 

The  result  for  the  development  time  is  extreme¬ 
ly  important  from  a  conceptual  point  of  view.  The 
small  standard  deviation  is  both  a  curse  and  a 
blessing.  It  says  we  can  determine  the  development 
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Table  3. 

SAVE  SIMULATION 

i  INPUT  TO  SIMULATION 

;  S  =  98475, 

i 

:  .0  -  15, 

;  Technology  Constant 

!  RESULTS  OF  SIMULATION 

Number  of  iterations  =  2170 

i 

i 

|  Expected  Development  Time  -  1 .81  years 

Standard  Deviation,  Development  Time  =  .063  years 
Expected  Development  Effort  =  35.1  Manyears 
Standard  Deviation,  Development  Effort  =  3.77  Manyears 


r  s  -  7081 


VD 

10040 


Table  4. 

SAVE  MILESTONES 
trf  -  1 .81  years 


EVENT 

t/t . 

d  ; 

time  from  start 
(years) 

time  from  s 
(months ) 

CDR 

.43  ; 

j 

' 

.79  i 

i 

9 

Software  j 

S.I.T. 

.67 

1 .21 

1 

15 

Hardware 

S.I.T. 

.80 

1.44 

| 

17 

Start 

Installation 

.93 

1.68 

i 

20 

Start 

Acceptance 

Test 

i  1 

1.0 

i 

!  ; 

22 

Complete 

Acceptance 

Test 

1 .14 

i 

i 

■  2.06 

1 

25 
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time  very  accurately 


^  't  ^d  3.5'  )  but  at  the 


same  time  it  tells  us  we  have  little  latitude  in 
adjusting  the  development  time  to  meet  contractual 
requirements. 


For  example,  ■’  =  .063  years  is  .063  (52)  - 

cd 

*  3.28  weeks;  3r  =  3  (3.28)  =  +  y. S3  weeks; 
ld 

=  +  9.83  *  10  weeks 


So,  if  we  add  3o  to  td  we  will  be  99  sure 
that  tj  will  not  exceed  the  actual  value  from  ran¬ 
dom  causes.  This  does  not  mean  that  requirements 
changes  or  late  delivery  of  a  computer  will  still 
permit  the  software  to  come  in  at  ♦  10  weeks  of  the 

expected  time.  These  are  external  factors  that  will 
change  td  and  must  be  specifically  accounted  for. 

This  is  the  curse.  The  system  is  very  sensitive 
to  external  perturbations  and  these  will  generally 
cause  development  time  increments  greater  than  2  or 
3  ■  (a  90  day  delay  in  test  bed  computer  delivery, 
a 

say) . 


But,  knowir.q  this  great  time  sensitivity,  man¬ 
agement  can  use  it  effectively  in  planning  and  con¬ 
tracting  so  that  risk  is  always  acceptable.  The 
major  point  is;  time  is  not  a  free  good.  Develop¬ 
ment  time  cannot  be  specified  by  management.  The 
system  determines  that  (i.e.,  software  systems  are 
inherently  narrow  band  processes  with  sharp  cutoff 
characteristics,  a  point  that  will  be  elaborated  on 
later) . 

Manpower  and  Lash  Flow  Pattern 

Now  that  we  have  the  parameters  for  develop¬ 
ment  effort  and  development  time  we  can  generate  the 
manloading  and  cash  flow  pattern  for  the  software 
development  period  (and  even  the  life  cycle,  if  we 
choose).  The  Rayleigh/Norden  equation  gives  the 
instantaneous  manpower. 

2 

y  =  K/td2  .  t  .  e  _t  /2t/  MY/YR 

K  »  E/.4  =  35.1/. 4  =  87.75  MY 

td  =  I. 81  years 


so 

,  .  t  .  exp  (-t2/2(1.81)2)  MY/YR 


for  the  software  development  effort  (Phase  11).  The 
cash  flow  is  just  the  average  dollar  cost/MY  times 

y. 


Cash  Flow  Phase  11  =  5?MY  .  y  S/YR 

Table  6  combines  the  software  development  ef¬ 
fort  (Phase  II)  with  the  initial  design  and  system 
specification  (Phase  II)  overlap  and  the  hardware 
integration  and  test  effort.  The  column  labelled 
total  adds  the  separate  efforts  together  at  each 
time  period  to  show  the  total  people  on  board.  The 
cash  flow  rate  is  the  annualized  spending  rate  at 
that  instant  in  time  (assuming  an  average  burdened 
cost/MY  of  $50,000).  The  last  column  gives  the 
cumulative  cost  at  each  two  month  interval.  Note 
that  the  total  cumulative  cost  is  greater  than  the 
predicted  software  development  cost.  This  is  be¬ 
cause  we  have  included  the  phase-out  cost  of  Phase  I 
effort  plus  the  hardware  planning  integration  ef¬ 
fort. 


Figure  3  shows  the  time-phased  manloadinq  of 
the  Phase  II  part  of  the  project  as  laid  out  in 
Table  6.  Note  that  the  overall  curve  is  slightly 
distorted  from  the  nominal  software  development 
effort  because  of  Phase  I  phase-out  and  hardware 
planning  efforts.  Milestones  are  also  shown  on  this 
figure  to  graphically  portray  where  these  should 
occur. 

Li n ear  Pronramming  Alternative 

An  alternative  method  for  the  Rayleigh  para¬ 
meter  determination  is  linear  programming.  Since 
we  are  dea'ing  with  only  two  unknowns,  K  and  t.j, 

and  have  a  number  of  constraint  conditions  involving 
these  parameters,  we  can  easily  turn  it  into  a  two 
dimensional  linear  programming  problem  which  can  be 
solved  graphically.  The  nice  feature  of  this  ap¬ 
proach  is  that  a  number  of  the  constraints  can  be 
expressed  directly  in  management  terms.  Design  to 
cost  and  design  to  contract  time  is  possible  within 
the  constrained  optimization  procedure.  This  pro¬ 
cedure  is  outlined  below.  Ihe  following  constraint 
conditions  apply; 

S.  =  r.K  t^22  Software  equation 


K/td 

*  c  y 

J  max 

Maximum  peak  manpower 

K/td 

>  y 

■'max 

Minimum  peak  manpower 

K/t2 

■  |D  | 

Maximum  difficulty 

K/t2 

■_  t'.’Dl 

Maximum  difficulty  gradient 

‘d 

;  contract 

delivery  time 

$/MY  (,4k)  ■  Total  budgeted  amount  for  develop¬ 
ment 


POWER 


0 


0 


6 

.5 


118 


These  constraint  conditions  can  be  linearized 
by  taking  logarithms  and  using  the  simplex  method 
of  solving  the  linear  programming  problem.  The 
simplest  objective  functions  are  cost  and  time.  Ono 
generally  wants  to  minimize  one  or  the  other  of 
these.  Typically  we  do  both  and  then  trade-off  in 
the  region  in  between. 

Assume  these  constraints  applied  to  SAVE: 

Number  of  S$  =  9847S 

Maximum  development  cost  1  $2  million 

Maximum  time  (contract  delivery)  2  years 

Maximum  manpower  available  at  peak  manninq 
(hiring  constraint,  say)  28  people 

Minimum  manpower  you  desire  to  employ  at 
peak  manning  •  15  people 

Maximum  difficulty  gradient  c  15 

Maximum  difficulty  _  50 

Minimum  productivity  >  2000  S  /  MY 


Trade-off  is  possible  along  the  5r  line  between 
td  =  1.83  years,  K  =  84  MY  and  td  -  2  years, 

K  =  61  MY  without  violating  constraints. 

Here  it  is  easy  to  see  the  counterintuitive 
nature  of  productivity.  Note  that  productivity  in¬ 
creases  with  development  time  because  the  require  d 
effort  (E)  goes  down  as  time  is  increased. 

One  other  point  is  important.  If  the  techno¬ 
logy  constant  is  smaller,  the  S  =  98475  line  would 

shift  parallel  to  the  right  (direction  of  increas¬ 
ing  time).  If  the  constraints  remained  numerically 
the  same,  the  feasible  region  would  change  because 
of  the  relocation  of  the  S  line.  The  time  con¬ 
straint  could  probably  notoe  met  and  a  relaxation 
of  that  constraint  would  have  to  be  sought. 

This  is  a  deterministic  solution.  However,  by 
extending  tne  idea  of  simulation,  the  linear  pro¬ 
gramming  concept  can  be  embedded  within  a  simulation 
and  the  uncertain  constraints  can  be  allowed  to  vary 
randomly  about  their  mean  values  and  the  statistical 
uncertainty  for  the  minimum  time  and  minimum  cost 
solutions  can  be  obtained  by  running  the  problem  a 
few  thousand  times. 


These  translate  into: 


Results  of  the  Example 


1/3  log  K  +  4/3  log  trf  =  log  98475  -  log  10040 


log 

K 

= 

log 

(2  x  106/5  x  104 ( .4) ) 

log 

ld  = 

log 

2 

log 

K  - 

log 

*d  - 

log 

(Je  28) 

log 

K  - 

log 

ld  = 

log 

[y'e  15) 

log 

K  - 

3  log 

*d  = 

log 

15 

log 

K  - 

2  log 

ld  = 

log 

50 

log 

K 

= 

log 

(98475/  .4(2000)) 

The  intersection  of  these  lines  bound  the 
feasible  region.  An  optimal  solution  will  be  at 
some  intersection  point.  Further,  because  of  the 
equality  constraint  it  must  be  along  the  $s  -  98475 

line.  The  limiting  conditions  in  this  case  are: 
td  ^  is  2  years,  maximum  peak  manpower  <  28  people 

and  Ss  =  98475  source  statements.  Figure  4  shows 

the  solution. 

Reading  off  the  solutions  we  see  that: 
td  K  E  PR 


(yrs) 

(MY) 

(MY) 

(S„/MY) 

3 

Minimum 

Time  1 -83 

84 

33.6 

98475/33.6  «  29". I 

Minimum 

Cost  2.0 

61 

24.4 

98475/24.4  =  4036 

! _ 

S1.22M 

We  have  shown  that  the  management  questions 
posed  at  the  beginning  can  be  answered  quantita¬ 
tively  to  acceptable  engineering  accuracy  for  a 
software  project  during  the  specification  prepara¬ 
tion  phase.  We  need  only  know  the  state-of-tech- 
nology  we  are  going  to  apply  to  the  deveU  ment, 
estimate  the  number  of  lines  of  code  using  the  PERT 
techniques,  and  use  the  software  equation  with  a 
constraint  relationship  to  solve  for  the  manage¬ 
ment  parameters  ( K, t^)  of  the  Rayleigh/Norden  equa¬ 
tion.  Simulation  provides  suitable  statistics  for 
risk  estimation. 

IV •  A  Look  at  Fundamentals 

Having  shown  some  of  the  consequences  of  tne 
Rayleigh  equation  and  its  application  to  the  soft¬ 
ware  process  let's  examine  why  we  have  selected  this 
model  rather  than  a  number  of  other  density  functions 
that  could  be  fitted  to  the  data.  We  return  to 
Norden's  initial  formulation  to  obtain  the  differen¬ 
tial  equation. 

Norden's  (8)  description  of  the  process  is: 

The  rate  of  accomplishment  is  proportional  to 
the  pace  of  the  work  times  the  amount  of  work  re¬ 
maining  to  be  done.  This  leads  to  the  first  order 
differential  equation 


Figure  4 
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S'  =  2  at  (K-y) 

where  y  is  the  rate  of  accomplishment,  2at  is  the 
"pace"  and  (K-y)  is  the  work  remaining  to  be  done. 

2 

Making  the  explicit  substitution  for  a  =  1/2  td 

we  have  y  =  t/t2  (K-y).  t/,.2  is  Norden's  linear 

ld  ld 

learning  law. 

This  is  also  the  form  of  a  generalized  diffu¬ 
sion  process  related  to  Fick's  equation  of  diffusion 
of  mass  transfer,  Newton's  equation  of  viscosity  in 
momentum  transfer  and  Fourier's  equation  of  heat 
conduction  (21,22). 

The  development  by  Mahajan  and  Schoeman  (21) 
of  the  time  pattern  of  the  diffusion  process  is 
sketched  below  in  the  context  of  the  spread  of  an 
innovation: 

f(t)  is  the  proportion  of  adopters  at  time,  t 

F(t)  is  the  cumulative  proportion  of  adopters 
at  time  t 

F  is  the  limit  of  the  cumulative  fraction  of 
adopters 

«.)  -  -  ££1 

Now  the  rate  of  diffusion  at  any  time  is  pro¬ 
portional  to  the  potential  adopters  available  at 
that  time,  or,  in  other  words,  as  the  cumulative 
number  of  adopters  approaches  its  ceiling,  F,  the 
rate  of  diffusion  decreases  proportionately.  Thus 

f(t)  =  ^  F(t)  %  (F-F(t) ) 

There  is  a  "constant"  of  proportionality  which  will 
be  labelled  g(t)  to  indicate  it  may  be  a  function 
having  a  time  dependence.  Then 

f(t)  =  g ( t )  .  (F-F(t) ) . 


This  is  the  rate  equation  of  the  diffusion  pro¬ 
cess.  One  initial  condition  is  required  to  solve 
it.  "The  rate  of  diffusion. .. is  controlled  by  g(t), 
the  value  of  which  depends  on  the  specific  innova¬ 
tion,  the  social  system  in  which  it  is  diffu  ed  and 
the  channel  and  change  agents  used  to  diffus  it: 
g(t)  =  f  (innovation,  social  system,  channels, 
change  agents).  For  a  given  innovation  in  a  speci¬ 
fic  social  system,  the  use  of  effective  channels 
and  change  agents  may  be  catalytic,  affecting  g(t) 
and,  hence,  the  rate  of  diffusion  —  g(t)  may  also 
be  interpreted  as  the  probability  of  an  adoption  at 
time  t.  Since  F-F(t)  is_the  proportion. . .not  adopt¬ 
ed,  the  product  g(t)  .  (F-F(t))  gives  the  expected 
proportion  of  adopters  at  time  t"  (21). 


We  translate  this  into  software  terms  and  the 
Rayleigh  parameters. 

f(t)  is  y  the  rate  of  accomplishment,  or 
manloading. 

F(t)  is  y  the  cumulative  fraction  complete, 
or  the  cumulative  effort. 

F  is  K  the  ceiling,  or  the  life  cycle  effort. 

g ( t )  is  Norden's  "pace"  (2at)  and  a  function  of 
the  system  (innovation),  the  software  organization 
(social  system),  tools,  constraints,  state-of-tech- 
nology  being  applied  (channels),  and  the  customer 
providing  requirements,  specifications  and  changes 
thereto  (change  agent). 

Writing  the  diffusion  equation  in  these  Ray- 
leigh/Norden  terms  we  have 

y  =  g(t)  (K-y) 

2 

j  =  9(t)  K-K(l-e'at  ) 

2 

9  *  g(t)  Ke‘at 

If  g(t)  is  equal  to  Norden's  linear  learning  law 
then  g(t)  =  2at  =  t/.2  where  n(t)  may  be  interpre- 
rd 

ted  as  the  probability  density  for  system  develop¬ 
ment  being  completed  at  td.  g(t)  may  also  be  thought 

of  as  the  innovation  frequency  or  the  group  problem 
solving  frequency. 

Substituting  2at  for  g(t) 

2 

we  have  y  =  2  K  ate~at  , 

the  derivative  form  of  the  Rayleigh  equation. 

Writing  the  equation  in  the  diffusion  form,  we 
have  }  -  t/.2  (K-y).  We  differentiate  or.ce  more 
a 

with  respect  to  time  and  obtain 

y  =  K/  2  -  y/  2  r  t(-y)/  2  ,  or 
cd  ld  td 

y  +  t/,2  y  +  y/  2  =  K/  2  =  D  which  has  also 

ld  zd  td 

been  obtained  in  a  somewhat  different  development 
by  Fix  (23).  This  is  a  form  of  what  we  shall  call 
the  dynamic  software  equation.  The  Rayleiah  inte¬ 
gral  is  its  solution  which  can  be  verified  by  direct 
substitution.  Note  that  this  equation  is  similar  to 
the  non-homogeneous  2d  order  differential  equations 
frequently  encountered  in  mechanical  and  electrical 
systems.  There  are  two  important  differences.  The 
forcing  function,  D  =  K /  2,  is  a  constant  rather 

^ A 


4 


The  software  differential  equation 


than  the  more  usual  sinusoid  and  the  y  term  has  a 
variable  coefficient,  t/t2,  proportional  to  the  1st 
cd 

power  of  time.  The  differential  equation  immediately 
suggests  the  electrical  and  mechanical  analogs.  The 
electrical  circuit  is  that  for  charging  a  capacitor 
with  a  constant  voltage  applied  through  a  variable 
resistor  whose  resistance  increases  linearly  with 
time. 


We  may  also  think  of  this  as  simple  low  pass 
filter. 


This  comparison  will  become  important  later  after  we 
relate  the  differential  equation  more  closely  to  the 
software  process. 

Systems  Have  a  Bandwidth 


It  was  stated  earlier  that  the  fundamental 
Rayleigh  equation  is  the  design  and  coding  rate 
curve  which  is  proportional  to  the  rate  of  code  pro¬ 
duction.  The  second  order  differential  equation  for 
this  process  expressed  in  manpower  terms  is 


*1 


The  sketch  below  sliows  the  relation  of  the  y^ 
curve  to  the  overall  manloading  curve,  </. 


h  +  6t't 2  h  +  6/tH2  *1  =  K't  2 
d  d  d 

is  really  a  powerful  narrow  band  filter.  One  can 
see  this  by  perturbing  the  right  hand  side  with  a 
large  magnitude  but  fast  sinusoid.  Very  little  rip¬ 
ple  shows  up  on  the  curve  and  yj  is  very  smooth. 

A  large  random  perturbation  on  the  right  hand  side 
responds  similarly,  i.e.,  the  system  filters  out 
all  but  the  narrow  band  of  frequencies  the  system  is 
sensitive  to. 

This  means  that  managers  have  to  input  work 
patterns  that  have  frequencies  that  will  be  seen  by 
the  system,  i.e.,  low  frequency  step  functions  vs. 
high  frequency  pulse  functions  (this  may  imply  that 
short  duration  crash  catch-up  efforts  are  largely 
ineffective;  in  any  event  they  have  no  long  term 
cost  and  time  increases). 

Armed  with  these  hints,  one  can  take  the  Fourier 
transform  of  y'j  curve  and  obtain  the  spectrum  of 

frequencies  the  system  will  respond  to.  The  Fourier 
transform  of  a  Rayleigh  equation  in  time  is  itself 
Rayleigh  in  frequency  (radians/year).  The  transform 
is  given  in  tables  of  Fourier  Sine  Transforms.  Plot¬ 
ting  the  spectrum  shows  that  it  is  indeed  narrow  and 
responds  significantly  only  to  signals  having  periods 
from  2  to  8  years,  i.e.,  step-like  functions,  again 
for  systems  in  the  range  of  interest  (1  It  d 14  years), 

In  general,  the  pulse  width  of  a  management  action 
has  to  be  on  the  order  of  twice  the  development  time 
to  have  long  term  consequences  on  the  system. 

This  leads  naturally  to  the  idea  that  software 
systems  have  a  characteristic  system  band  width 

(Bs  o,  .6/td).  Then  one  can  look  at  tliu  problem  in 

in  terms  of  the  signal  to  noise  ratio  and  use  the 
results  given  in  texts  on  random  signals  and  noise 
(17,18,19,20). 


Uses  of  the  Software  Equation 

The  differential  equation  y  +  t/tjj  y  +  1/tjj  y 
2 

'  K/td  is  very  useful  because  it  can  be  solved  step- 

by-step  using  the  Runge-Kutta  solution.  The  solu¬ 
tion  can  be  perturbed  at  any  point  by  changinci 
2 

K/td  =  D,  the  difficulty.  This  is  just  what  happens 

in  the  real  world  when  the  customer  changes  the  re¬ 
quirements  or  specifications  while  development  is  in 
process.  If  we  have  an  estimator  for  D  (which  we  do) 

that  relates  K/tjj  to  the  system  characteristics,  say 

the  number  of  files,  number  of  reports  and  number  of 
application  programs,  then  we  can  calculate  the 
change  in  D  and  add  it  to  our  original  D,  continue 
our  Range-Kutta  solution  from  that  point  in  time  and 
thus  study  the  time  slippage  and  cost  growth  conse¬ 
quences  of  such  requirements  changes.  Several  typi 
cal  examples  are  given  in  reference  13. 
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When  we  convert  this  differential  equation  to 
the  design  and  coding  curve  by  substituting 

td  /  lb,  we  obtain 

V\  +  6t/t2  S'!  *  6/t2  Vi  *  6Ky  2, 
d  d  td 

multiplying  this  by  the  PR  and  conversion  factor  as 
before,  we  obtain  an  expression  that  gives  us  the 
coding  rate  and  cumulative  code  produced  at  time  t. 


$s  +<w2  +  * 2-49  ■  PR  ■  K/td 

=  2.49  Cn  (K/t2)1/3 

$  +  -IV-  S  +  V  S  =  2.49  cn  (D)1/3 

td2  tdc  n 

Since  D  is  explicit,  this  equation  can  also  be 
perturbed  at  any  time  t  in  the  Runge-Kutta  solution 
and  changes  in  requirements  studied  relative  to 
their  Impact  on  code  production.  Of  course  the 

earlier  expressions  for  Ss  and  Ss  could  do  the  same. 

An  example  of  code  production  for  SIOPERS  using 
the  Range-Kutta  solution  is  given  below. 


V.  Summary  of  Estimating  Techniques 
at  Various  Stages  of 
_ the  Life  Cycle _ 

The  following  tables  summarize  the  techniques 
that  have  been  outlined  and  illustrated  earlier  in 
the  paper.  The  adaptive  filtering  technique  of  Box 
(13,15)  has  not  been  repeated  herein.  It  is  impor¬ 
tant.  Since  it  is  adequately  covered  in  the  refer¬ 
ences,  it  is  sufficient  to  say  that  the  Rayleigh/ 
Norden  manpower  curve  is  linearized  by  taking  loga¬ 
rithms.  Then  the  actual  manpower  (or  coding  rate) 
data  can  be  fit  by  least  squares,  or  plotted,  to  de¬ 
termine  the  parameters  k  and  t^.  This  technique  has 

been  applied  to  several  hundred  systems  and  works 
well,  providing  there  is  sufficient  time  history. 


So,  even  though  we  have  to  use  some  sophisti¬ 
cated  techniques  to  solve  the  problem  conceptually, 
the  actual  implementation  and  obtaining  of  answers 
is  mostly  grocery  store  arithmetic  with  a  little 
graphing  (except  for  the  Runge-Kutta  solution  which 
should  be  done  by  computer  or  programmable  calcula¬ 
tor;  simulation  should  be  done  on  a  machine;  Box 
method  can  be  done  graphically  very  nicely  with  good 
accuracy. ) 

We  have  shown  that  the  management  questions  can 
be  answered  before  software  development  starts  and, 
perhaps  more  importantly,  we  can  continually  up¬ 
date  and  converge  to  the  true  behavior  by  adaptively 
using  the  real  data  stream  in  a  dynamic  way  through¬ 
out  the  life  cycle  of  the  system.  Thus,  both  pre¬ 
diction  and  control  are  possible  to  obtain  and  use 
in  an  engineering  context. 

Finally,  we  sum  up  with  a  few  observations  im¬ 
portant  to  software  managers. 

Software  lessons  for  Executives 

•  SOFTWARE  DEVELOPMENT  IS  DYNAMIC 

—  NOT  STATIC 

•  PRODUCTIVITY  RATES  ARE  CONTINUOUSLY 

VARYING  —  NOT  CONSTANT 

•  PRODUCTIVITY  RATES  ARE  A  FUNCTION  OF 

THE  SYSTEM  DIFFICULTY  -- 
MANAGEMENT  CANNOT  ARBITRARILY 
INCREASE  PRODUCTIVITY 

•  BROOKS'  LAW  GOVERNS  --  TIME  AND  MAN¬ 

POWER  ARE  NOT  FREELY  INTERCHANGE¬ 
ABLE.  (SHORTENING  THE  "NATURAL" 
DEVELOPMENT  TIME  OF  A  SYSTEM  IS 
VERY  COSTLY  --  AND  MAY  BE 
IMPOSSIBLE) 
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Table  7. 


Runge-Kutta 

Solution  to  the  Coding  Rate  Differential  Equation 

for  SIDPERS 

t  (years) 

Coding  Rate 
(Ss/year ) 

(000) 

Cumulative  Code 

<Ss> 

(000) 

0 

0 

0 

.5 

52.8 

13.6 

1.0 

89.2 

50.0 

1.5 

101 .0 

98.6 

2.0 

90.8 

147.0 

2.5 

68.4 

187.0 

3.0 

44.2 

2150 

3.5 

24.9 

223.0 

— -  3.65 

20.33 

236.0  - — 

— —  Actual  size 

4.0 

12.3 

241 .0 

at  extension 

4.5 

5.36 

246.0 

was  close  to 
this 

5.0 

2.09 

247.0 

SIOPERS  parameters 

K  =  700  MY 

td  =  3.65  years 

D  =  52.54  MY/yr 

PR  =  914  Ss/MY  (burdened) 

DIFFERENTIAL  EQUATION 

s'  +  t  S  +  1  S  =  2.49.  PR  .  D  =  2.49.(12009D'Z/3).D 

/3J55\2  /  3.65  \  2  *  2.49  .  (12009)  .  (D)1/3 

\  JT I  \  /6  /  =  29902  .  (52.54)1/3 

=  111785 
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Table  8. 

SUMMARY  OF  ESTIMATING  EQUATIONS 


Phase 

Technique 

Equation 

Feasibility 

Rayleigh  parameter 

(1)  Ss  =  Ck  K1/3  t//3 

Functional 

estimation  by: 

Design 

(a)  Simulation 

(b)  Linear  rrogramming 

. 

(2)  K/td3  =  h’D| 

(3)  K/td2  =  |D] 

(4)  K/td  •  /r  ymax 

Solve  (1)  with  (2),  (3)  or  (4) 

(a)  &  (b)  are  conceptually  the 

same. 

Development 

Initially,  use  result 

ln(y/t)  =  (K/t2)  -  1  t2 

above: 

Box  Technique: 

then  fit  actual  overall 

2t  ^ 

manpower  using  Box  tech¬ 

Input  y,  t,  t2;  solve  for  td>  K  . 

nique.  Fit  code  produc¬ 

tion  rate  using  same 
technique.  This  will  up¬ 
date  parameters.  Generate 
manloading  curve. 

Input  Ss>  t,  t2;  solve  for  td//F  ,  S$ 

Operations  & 

Box  method  as  above: 

Maintenance 

correlate  with  K  =  E/ 

pp  _  Total  Ss 

td  actual,  since  these 
data  are  available  from 
development. 

E 

Development, 

Requirements  change.  Use 

y  +  t/td2  y  +  l/td2  y  =  K/td2  +  AD 

Operations  & 

Range-Kutta  method  to 

Maintenance 

access  future  impact. 

AD  is  usually  positive;  exception  is 
when  a  portion  of  system  is  killed 
or  taken  away 

Another  table  will  help  to  specify  the  solution  in  terms  of  the  management  questions 

Table  9. 

SUMMARY  OF  ESTIMATING  TECHNIQUES  FOR  USE  BY  MANAGERS 


How  much  will  it  cost? 


How  many  people  at  any  time? 


Cash  flow? 


What  kind  of  skills? 


What  are  the  risks? 


Is  calculated  t,  i  contract  time?  (Yes  -  OK) 
d 

Is  calculated  cost=$/MY( ,4K)  <.  contract  cost? 
(Yes  -  OK) 

Is  difficulty  and  difficulty  gradient  reasonable 
for  class  of  work? 

| D  |  .150  unless  there  is  a  good  precedent  and 
lots  of  parallelism  can  occur. 

|  V  D  |  -  7  new 

15  standalone 
27  rebuild 
55  composite 

Is  C^  consistent  with  past  experience,  type 
system,  language,  tools,  machine  environment? 
(Yes  -  OK) 


Dev.  Cost  =  $/MY(.4K)  _ 


Dev.  time  *  td 

where  K,  td  come  from  S$  =  Ck  K^3  t^3 
and  simultaneous  solution  of 

K/t/  -  |D| 

K/td3  =  |  VD  | 


Max  cost  =  I/MY( . 4 K ) 


.  contract  time 


y  (t)  =  K/  2  .  t  .  e”t  /2td 

_ ji _ _ _ 

$  (t)  =  I/MY  .  y  (t)  ( ■$/ YR ) 


Subjective  by  manager 


(MY/YR) 


What  are  the  trade-offs? 


What  is  the  cost  and  schedule  impact  of  a 
requirements  change  (during  development)? 


E  = 


\Ck> 


37my 


$  oev  = 


.4 


How  do  constraints  affect  the  answers? 

If  constraints  cannot  be  met,  trade-off  law  has 

to  be  invoked  to  get  K,  td  in  a  reasonable 

range.  Lengthening  td  will  usually  do  this.  If 

can't  meet  time,  gradient  constraint,  project  is 

not  do-able  as  is.  Size  of  system  (S^)  nas  to 

be  reduced. 

At  the  time  of  impact,  solve 
y  +  t/t.^  y  +  l/t^2  y  =  D  +&D  for  y,  y.  Plot 
for  all  t,  compare  with  previous  y  curves. 

^new^  ^  ^new 

yold^  a  ^  '  Kold 

Knew  -  Kold  =Ak 
AE  =  AK  (.4). 


A  LC  Cost  =  S7M7  .  AK;  A  DEV  cost  =  $7MY  .  AE 
y  new  (peak)  gives  new  t^.  (Lagrangian  3 
point  interpolation  can  give  this  precisely.) 
then  Atrf  =  slippage 

=  t .  -  t . 

°new  “old . 
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K. 


t 


d 


E 


y 

Jmax 

K/td2 

K/td3 


Rayleigh/Nordon  life  cycle  effort 
parameter;  work  units,  man  years,  man 
months,  etc. 


Rayleigh/Norden  time  parameter.  Time 
which  peak  manpower  normally  occurs  for 
large  software  projects.  Mathematically 
the  peak  of  the  curve. 


e-t2/2td2. 


Development  effort.  Area  under 
Rayleigh  manpower  curve  up  to  td  (peak). 
Nominally,  E  «  ,4K. 

Peak  manpower.  Normally  occurs  at  td- 

Vmax  ■  -JL- 

*  ld 


System  Difficulty,  a  force-like  term. 
Ratio  of  effort  to  development  time 
squared. 

Proportional  to  magnitude  of  difficulty 

gradient.  VD  =  i  (-2K/.  3)  +  j(l/.  2) 
ld  zd 


3  2 

-  2K/td  since  l/td  is  always  a  small 
number  in  the  systems  context. 


Jffi 

$  DEV 


y 

y 

y 


No.  of  delivered  lines  of  executable 
source  code.  Does  not  include 
comments.  What  the  programmer  writes 
as  opposed  to  machine  language  in¬ 
structions. 

A  state  of  technology  constant.  Re¬ 
lates  to  the  software  equation 

Ss  =  K^3  td^3.  A  channel  capacity 

constraint  in  the  information  theory 
sense.  (May  be  thought  of  as  the 
diameter  of  the  "pipe"  regulating  the 
"flow"  of  source  code). 

Average  dollar  cost  per  man  year  of 
effort. 

Development  cost  in  dollars. 

$  DEV  =  I7MY  .  E. 

Cumulative  effort  at  time  t. 

(man  years) . 

Manpower  at  time  t.  (MY/YR). 

Rate  of  change  of  manpower  at  time 

t.  A  force-like  term,  (man  years  per 
2 

year  ) . 


Pi?  -  Productivity.  Defined  as  the  total 

number  of  source  statements  (Ss )  divided 

by  the  development  effort  (E). 

Pi?  *  ^s  =  source  statements/MY. 

E  .4K 

This  is  generally  a  burdened  number 
which  includes  overhead  (non- 
programming)  effort. 
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ABSTRACT 

This  paper  sumarizes  some  of  the  lessons  we  learned  recently 
in  developing  a  software  cost  estimation  model  for  TRW  With 
respect  to  the  AIRM1CS  Workshop,  we  will  concentrate  on  three 
issues  which  we  found  particularly  important  and  useful  to  ad¬ 
dress  in  an  industry-wide  context  at  the  Workshop.  These  issues 
are: 

1 .  There  is  a  need  to  develop  a  set  of  well-defined,  agreed-on 
criteria  for  the  "goodness"  of  a  software  cost  model. 

2.  There  is  a  need  to  evaluate  existing  and  future  models  with 
respect  to  these  criteria. 

3.  There  is  a  need  to  emphasize  “constructive"  models  which 
relate  their  cost  estimates  to  actual  software  phenomenology 
and  project  dynamics. 

1.  CRITERIA  FOR  THE  GOODNESS  OF  A  SOFTWARE 
COST  MODEL 

Our  initial  criteria  reflected  our  primary  objective  in 
developing  the  new  TRW  Software  Cost  Estimating  Program 
(SCEP).  to  serve  as  an  aid  in  developing  cost  estimates  for 
competitive  proposals  on  large  Government  software  projects. 
One  may  have  other  objectives  for  developing  a  cost  model  e.g. 
to  evaluate  the  impact  of  using  new  techniques  which  may 
require  a  somewhat  different  set  of  criteria. 

Here  is  a  list  of  the  criteria  we  found  important  in  our  context: 

1.  Definition.  Has  the  model  clearly  defined  which  costs  it  is 
estimating,  and  which  costs  it  is  excluding? 

2.  Fidelity.  Are  the  estimates  close  to  the  actual  costs  ex¬ 
pended  on  the  projects? 

3.  Objectivity.  Does  the  model  avoid  allocating  most  of  the 
software  cost  variance  to  poorly-calibrated  subjective 
factors  (e  g.  complexity)?  That  is,  is  it  hard  to  juggle  the 
model  to  get  any  result  you  want? 

4.  Constructiveness.  Can  a  user  tell  why  the  model  gives  the 
estimates  it  does?  Does  it  help  him  understand  the  software 
job  to  be  done? 

5.  Detail.  Does  the  model  easily  accommodate  the  estimation 
of  a  software  system  consisting  of  a  number  of  subsystems 
and  units?  Does  it  give  (accurate)  phase  and  activity  break¬ 
downs? 

6.  Stability-  Do  small  differences  in  inputs  produce  small 
differences  in  output  cost  estimates? 

7.  Scope.  Does  the  model  cover  the  class  of  software  projects 
whose  costs  you  need  to  estimate? 

8.  Ease  of  Use.  Are  the  model  inputs  and  options  easy  to 
understand  and  specify? 


9.  Prospectiveness.  Does  the  model  avoid  the  use  of  informa¬ 
tion  which  will  not  be  well  known  until  the  protect  is 
complete?* 

In  the  process  ot  developing  the  1  RW  SCI  I’  model  and 
evaluating  existing  models  (Refs.  1  - 1 0 1.  we  found  each  ot  these 
criteria  important  in  terms  of  lessons  we  we  e  learning  about 
software  cost  modeling.  Below  are  some  ol  the  results  ot  our 
evaluation  and  model  development  with  respect  to  the  first  four 
criteria. 

1.1  Definition. 

Where  we  ran  a  software  cost  model  to  support  a  cost  proposal 
one  of  the  first  questions  we  would  get  from  the  proposal 
manager  was.  “Does  this  estimate  include  the  cost  of  management? 
requirements  analysis?  training?  computer  operators’"  etc.  We 
were  somewhat  surprised  to  find  that  the  documentation  for 
most  existing  cost  models  doesn't  satisfactorily  answer  this  ques¬ 
tion.  For  the  TRW  SCEP  model,  we  found  that  the  best  solution 
was  to  define  a  standard-form  Work  Breakdown  Structure  (Fig. 
1 )  ami  use  it  to  define  which  costs  were  inclmh-tl  in  our  estimates 
and  which  were  excluded. 


Fig.  1.  Work  Breakdown  Structure  Activity  Hierarchy 


‘Clearly,  this  criterion  is  specific  to  the  objective  of  cost  predic¬ 
tion.  For  other  objectives,  such  as  technology  impact  assessment, 
a  retrospective  model  could  be  appropriate. 
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i  :  l  uU'hts 

We  found  that  xiiiic  existing  models  seemed  to  work  well  lor 
some  classes  of  software  and  not  very  well  for  others,  l  or 
example,  l  ie  2  presents  the  results  of  our  analysis  of  the  Boeing 
and  Putnam '0  estimation  models  when  applied  to  their  own 
(Boeing  and  US  Army  Computer  Systems  Command)  data.  It  is 
seen  that  the  Boeing  model  appears  to  estimate  small  projeets 
reasonably  well,  hut  gives  extreme  overestimates  on  large  pro¬ 
jects.  w  hile  the  Putnam  model  does  just  the  reverse.  It  would  he 
highly  valuable  for  the  field  if  similar  information  were  available- 
witti  respect  to  other  models  and  other  software  project 
attributes. 


ACTUAL  MAN  MONTHS  O 
ACTUAL  MAN  YlAflS  & 


F*.2.  Estimator  Fidelity  versus  Project  Si2e.  Boeing  and  Putnam  Models 


e.  Refine  the  model  parameters  via  a  two-round  Delphi  exercise 
involving  10  experienced  TRW  line  and  project  managers. 

d.  Calibrate  the  model  parameters  using  an  initial  sample  ol 
seven  completed  projeets. 

e.  evaluate  the  model  using  a  sample  of  20  completed  projeets. 
including  the  initial  seven.  This  evaluation  was  done  by  an 
independent  third  party. 


Fig.  4.  Factors  Used  in  Various  Cost  Models 


For  comparisc'-.  Fig.  3  shows  the  performance  of  the  TRW 
SCFP  model  on  20  completed  TRW  projects. 
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Fig.  3.  Example  of  TRW  Software  Cost  Model  Performance 

The  sequence  of  activities  in  developing,  calibrating  and 
evaluating  the  model  was  as  follows; 

a  Survey  the  current  state  of  the  art  in  cost  estimation.  Fig.  4 
shows  one  of  the  results  of  the  survey,  a  summary  of  the 
factors  used  in  the  major  current  cost  estimation  models, 
b  Develop  a  baseline  model.  The  phase-sensitive  nature  of  the 
model  is  based  on  a  model  developed  earlier  by  the  authors 
to  evaluate  the  cost  impact  of  new  software  technologies;* 1 
it  is  functionally  similar  to  the  Boeing  model^  in  this  regard. 


The  high  value  of  R-,  the  square  of  the  correlation  coefficient, 
should  be  interpreted  with  considerable  care.  For  example,  the 
sample  consists  of  only  medium-to-very  large  TRW  government 
contract  software  efforts;  we  3te  not  sure  how  it  performs  on 
other  types  of  software  effort.  Further,  the  calibration  is  on  past 
projects.  We  have  included  a  factor  to  cover  the  impact  of  soft¬ 
ware  technology  improvements:  however,  until  we  get  more 
experience  in  comparing  SCFP  estimates  with  the  resulting 
project  actuals,  we  must  consider  the  model  stiil  in  an  experi¬ 
mental  state. 

1.3  Objectivity. 

Figure  5  shows  one  of  the  results  of  our  analysis  of  the  RCA 
PRICK  S  model  in  September  1477.  (It  i.  possible  that  the 
model  may  have  been  subsequently  changed).  It  shows  the 
extreme  sensitivity  of  the  model  to  the  subjective  complexity 
iSDCPLX)  factor.  If  you  describe  a  project  as  "HARD  ',  the 
model  will  produce  a  cost  estimate  that  is  b-7  times  higher  than 
if  you  describe  the  project  as  "FASY".  This  is  a  huge  source  of 
variation  for  a  parameter  that  is  entirely  subjective.  As  we 
have  found  from  experience,  it  means  that  a  user  can  make  the 
model  produce  any  cost  estimate  he  wants,  simply  by  modifying 
the  subjective  complexity  factor.  (In  fact.  PRICE  S  has  a  mode 
called  ECiRP  which  will  do  this  for  you  automatically).  Doing 
this  may  solve  a  user's  short-term  pricing  problem,  but  it  hardly 
leaves  you  with  the  feeling  that  you  have  performed  any  objec¬ 
tive  and  meaningful  eosl  estimation  function  for  the  user. 

In  developing  the  TRW  SCFP  model,  we  found  that  we  were 
unable  to  avoid  including  a  complexity  factor.  However,  we  have 
made  the  complexity  rating  an  attribute  of  each  individual  unit 
in  the  software,  and  provide  users  with  a  set  of  scales  for  cali¬ 
brating  the  complexity  of  different  types  of  units.  Figure  <> 


shows  the  complexity  scale  tor  coinputalional/numcrica!  analysis 
type  units  as  an  example.  Having  such  a  scale  makes  the  complex¬ 
ity  rating  a  much  more  objective,  verifiable  attribute. 


Fig.  5.  Effect  of  Complexity  Factor  in  RCA  Price  S  Model  (Sept.  1977) 
1.4  (  oust rue tiveness 

We  use  the  I'RW  SC  I  P  model  as  a  means  of  cross-checking 
separate  cost  estimates  developed  by  project  personnel  in  differ¬ 
ent  ways.  Inevitably,  this  leads  to  differences  between  the 
various  estimates,  and  a  need  to  understand  why  one  was  higher 
or  lower  than  the  other.  Similarly,  project  personnel  want  t o 
know  in  project  terms  why  different  factor  ratings  give  them 
different  cost  estimates. 
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Fig.  6.  Example  of  Complexity  Scale:  TRW  Cost  Model 

in  terms  of  the  impact  of  a  very  high  Required  Quality  on  his 
project  activities.  In  general,  we  have  found  that  such  tables  do 
help  the  users  of  the  cost  model  better  understand  the  software 
job  they  are  preparing  for.  Similarly,  we  feel  they  provide  a  link 
between  the  “Predictive  Models"  and  “Life  Cycle  Dynamics" 
portions  of  t lie  Workshop  which  deserve  discussion  and  explana¬ 
tion  during  the  Workshop. 

1.5  Other  1  actors. 

We  have  encountered  similar  lessons  learned  with  respect  to 
other  factors  m  the  above  list:  detail,  stability,  etc.  The  length 
constraints  of  this  paper  preclude  our  discussing  them  in  detail 
here;  however,  we  hope  they  can  serve  as  topics  for  discussion 
at  the  Workshop. 


EXAMPLE  RATINGS  FQR  COMPUTATIONAL -TYPE  UNIT 


•  COMPUTATIONS  TO  EVALUATE  SIMPLE 
EXPRESSIONS:  A  «  B  *  C  *  (0  -  E. 


•  USE  OF  STANDARD  MATH  AND  STATISTICAL 
ROUTINES  AND  BASIC  MAT*  U -VECTOR 
OPERATIONS 


•  USE  OF  BASIC  NUMERICAL  ANAL  Y5IS  SKILLS 
E  G  ,  MULTIVARIATE  INTERPOLATION 
ORDINARY  DIFFERENTIAL  EQUATIONS 


•  DIFFICULT  BUT  STRUCTURFO  NUMERICAL 
ANALYSI5E  G  .  NEAR -SINGULAR  MATRIX 
EQUATIONS. PARTIAL  DIFFERENTIAL  EQUATIONS 


•  DIFFICULT.  UNSTRUCTURED  KlFtRICM 
ANALYSIS  EG.  FAST.  HlGHL  Y  ACCURATE 
ANALYSIS  OF  NOISY.  5T0CHASTIC  DATA 


In  order  to  help  answer  these  questions,  and  in  order  to 
promote  accurate  factor  ratings  by  users,  we  have  provided  a  table 
for  each  of  the  factors  in  the  I  RW  SO  P  model  that  shows  the 
impact  on  project  activities  of  the  various  factors  and  their 
ratings  1  or  example,  the  table  for  the  “Required  Quality" 
f  actor  is  shown  as  Figure  7  When  a  user  rates  his  system  as 
having  a  "Very  High"  Required  Quality,  the  SO V  model  will 
increase  the  cost  of  performing  the  various  phases  of  system 
development  by  a  certain  lactor.  which  may  vary  by  phase.  I  he 
fable  in  Figure  "*  thus  tells  the  user  why  his  costs  have  increased. 


:  SU.CihSTI  D  ISSll S  1  OR  WORKSHOP 
Rased  on  the  discussion  above,  here  are  some  suggested 
questions  which  would  be  useful  to  discuss  at  the  Workshop 
a.  What  criteria  are  important  tor  the  utility  of  a  software  cost 
estimation  model? 

h.  In  general,  how  do  current  predictive  models  stack  up  with 
respect  to  these  criteria?  Are  there  general  deficiencies  in 
current  models  (eg.,  uniform  or  comparable  definitions! 
whkh  need  attention? 


FACTO# 

RATING 

ROTS.  ANALYSES 

PRELIM.  DESIGN 

DETAIL  DESIGN 

CODE  &  UNIT  TEST 

INTEG,  &  TEST 

1.  RECUIREO 
quality 

1 

•  LITTLE  DETAIL 

•  MANY  TPD’S 

•  LITTLE  VALIO’N 
•M'NIMAL  PLANS 

'0.1  C V,  ACCEPT.) 
•MIN'MAL  5RR 

•  LITTLE  DETAIL 

•  MANY  TOD'S 

•  LITTLE  VERIF'N 

•  MINIMAL  OA.  CM.  STDS. 

•  MINIMAL  PDR 

•  NO  DRAFT  USER  MAN. 

•  BA5IC  DESIGN  INFO. 

•  MINIMAL  UOF'S 

•  minimal  oa,  cm 

•  minimal  cor 

•  NO  DRAFT  u:>ER  MAN. 

•  MINIMAL  U.T.  PLAN 

•  MINIMAL  FCL'S.  PATH 
TEST.  STDS  CHECK 

•  M'M  MAL  OA.  CM 

•  MINIMAL  1,0  AND  CPF. 
NOMINAL  TESTS 

•  MINIMAL  USER  MAN. 

•  MINIMAL  TEST  PLANS 

•  NO  TEST  PROCEDURES 

•  MANY  ROTS.  UNTESTED 

•  MINIMAL  QA.  CM 

•  MINIMAL  j"PrSS,  OFF- 
NOM  NAL  TESTS 

•  MINIMAL  AS  Ill'll T  DOCU. 

•OASIC  INFO,  VALID'N 

•FREQUENT  TBD'S 
•BASIC  PLANS 
!OA.  CM,  ACCENT.) 

•  BASIC  SR* 

•  OASIC  INFO.  VERIF'N 
•FREQUENT  TOD'S 
♦OASIC  OA  CM.  STDS 
•OASJC  PDR 
•MINIMAL  USER  MAN. 

•MODERATE  DETAIL 

•  OASIC  UDF'S,  PA, 

CM.  CDR 

•  MINIMAL  USER  MAN. 

•OASIC  U.  T.  TL  AN 
•PARTIAL  FCL'S.  PATH 
TEST  STDSCHCCK 
•OASIC  OA,  CM.  U.M. 

•  PARTIAL  l-'O  AND  OFF- 
NOMINAL  TESTS 

♦OAMC  TEST  Pj.  A  VS 
•M’N'MAL  TEST  PROCED. 

•  F  RFvJUl  NT  RUTS  UNTCSTED 
•0  AVC  OA.  CM.  U.M. 
•PARTIAL  STRESS.  OFF. 
NOMINAL  TESTS 

FULL  TRW  POLICIES 

FULL  TRW  POLICIES 

FULL  TRW  POLICIES 

FULL  TRW  POLICIES 

FULL  TRW  POLICIES 

H 

•DETAILED  VALID'N, 
PLANS,  SR# 

•  DETAILED  VERIF'N, 

OA,  CM,  STDS,  PDR, 
COC'N 

•DETAILED  VERIF'N. 

OA,  CM,  STDS.  CDR 
DOC'N 

•DETAILED  U.T.  PLAN, 
FCL'S.  QA.  CM,  DOC'N 
CODE  WAL KTMRUS 
•EXUNS1VE  01  F  . 

NOM  NAl  TESTS 

•OETAtirO  TEST  PLANS, 
I’POCF  DURE S,  CA.  CM. 

DCV.  *  N 

•EXTENSIVE  STRESS.  OFF- 
NOM  NAL  Tl  STS 

VH 

•  DET/'LEO  VALID'N, 

PL  ANS.  SRR 
•IVAV  INTERFACE 
(SUPPORT.  RESPONSE) 

•  DETAILED  VERIF'N, 

OA,  CM.  STDS,  PDR. 
DOC'N 

•  IVAV  INTERFACE 

•DETAILED  VERIF'N, 

OA.  CM,  STDS,  CDR, 
DOC'N 

•IV».V  INTERFACE 

•  Pr  T  All  ED  V  T.  PLAN. 

i F O  '■*.  C A  .  CM.  DOC  N't 
CfDI  WALK  TIROS 
•VERT  r  >.  1  :  OFF- 

NOMINAL  TESTS 

•  1 V  V  V  IN  T E  pr  ACE 

•VERY  DT  TAIL  ED  TEST 
l'i  ANS.  I 'ROC’S,  OA,  CM. 

UX  'N 

•  .  ■  EX  Tf  N*ivr  STFESS. 

Cl  f  \>M  NV.  Tf  '  FS 
•WVV  INTI  ACC 

Fig.  7.  Example  of  Factor/Rating  Definitions:  TRW  Cost  Model 


c.  lo  what  extent  do  current  predictive  models  explain  solt- 
u.i re  life-cycle  dynamics?  Are  there  ways  to  help  close  the 
yap  between  models  and  project  dynamics’ 
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Abstract 

This  paper  deals  with  the  modelling 
of  software  errors  encountered  in  a  small 
and  a  large  software  system.  A  determin¬ 
istic  analysis  of  software  failure  process 
is  presented  to  obtain  an  appropriate  mean 
value  function  for  a  non-homogeneous 
Poisson  process.  Several  quantitative 
measures  for  software  quality  assessment 
are  also  proposed.  Statistical  techniques 
of  inference  about  unknown  parameters  are 
discussed  and  detailed  analyses  of  software 
error  data  from  two  systems  are  presented. 


1 .  Introduction 

The  importance  of  modelling  and 
analysis  of  software  error  phenomena  has 
been  well  recognized  during  the  last  few 
years  and  many  studies  have  addressed  this 
problem,  see  for  example  11,4,5,6,7,8,9, 
10,12,13,14,15,16,17).  An  important 
objective  of  most  of  these  investigations 
has  been  to  develop  analytical  models  for 
the  error  phenomenon  in  order  to  compute 
quantities  of  interest  such  as  the  number 
of  errors  detected  by  some  time  t,  the 
number  of  remaining  errors  at  time  t,  the 
software  reliability  function,  and  the 
mean  time  to  software  failure.  Such 
quantities  are  useful  for  planning  purposes, 
both  in  the  development  and  the  operational 
phases  of  software  systems. 

In  this  paper  we  develop  a  time  de¬ 
pendent  model  for  software  errors  and 
illustrate  its  applicability  via  analyses 
of  error  data  from  two  software  projects. 

The  model  considered  here  is  a  non-homo- 
genious  Poisson  process  whose  mean  value 
function  is  derived  by  a  deterministic 
analysis  of  the  software  failure  process 
in  Section  2.  Several  quantities  of  inter¬ 
est  are  then  given  to  establish  quantitative 
measures  for  software  performance.  A  de¬ 
scription  of  the  first  project  (a  large 
scale  software  project)  and  error  analyses 
are  given  in  Section  3.  Software  error  data 
based  on  CPU  time  from  a  small  project  is 
given  in  Section  4. 


2 .  Time  Dependent  Model  and 
Quantities  of  Interest 

2.1  Model  Development 


Before  analyzing  the  stochastic 
behavior  of  the  software  failures,  it  is 
useful  to  make  a  simpler  analysis  ignoring 
statistical  fluctuations  in  the  number  of 
software  failures.  Suppose  n(t)  is  the 
cumulative  number  of  software  errors 
detected  by  time  t  and  n(t)  is  so  large 
that  it  can  be  treated  as  a  continuous 
function  of  t.  We  assume  that  the  number 
of  undetected  errors  at  any  time  is  finite 
and  hence  n(t)  is  a  bounded,  non-decreasing 
function  of  t.  We  further  assume  that  the 
total  number  of  errors  to  be  eventually 
detected,  n(°>)=a.  Then,  we  have 


n  (t)  = 


when  t=0 


a  when  t=~  ’ 

Now  let  the  number  of  errors  detected 
in  ( t , t+At)  be  proportional  to  the  number 
of  undetected  errors,  i.e. 

n(t+At)-n(t)  =  b!a-n(t)!At  (2) 

where  b  is  a  constant  of  proportionality. 
From  (2)  we  get  the  differential  equation 

n  ’  ( t )  =  ab-bn ( t )  .  ( 3 ) 

Solving  this  for  n(t),  we  get 

n (t)  =  a ( 1-e  bt)  .  (4 ) 

A  stochastic  analysis  of  this  phenomenon 
using  a  time  dependent  Poisson  process  (a 
non-homogeneous  Poisson  process)  and  the 
role  of  n(t)  in  such  analysis  is  presented 
next. 


Let  lN(t),tM)l  be  a  counting  process 
(number  of  errors  in  ( 0  ,  t ) ) .  The  differ¬ 
ence  between  n(t)  and  N(t)  is  that  the 
former  is  a  deterministic  number  while  the 
latter  is  a  random  variable.  The  N(t)  pro¬ 
cess  is  a  NHPP  with  intensity  function  >  (t) 
if 

(i)  N  ( 0  )  =  0 

(ii)  (N(t),t^0)  has  independent  incre¬ 
ments 

(iii)  P{ 2  or  more  events  in  (t,t+h)  l=o(h) 

(iv)  Pfexactly  1  event  in  (t,t+h)  I  = 
x (t) h+o (h) . 
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If  we  let 

m(t)  = 


!n  v ( s ) ds 


then  it  can  be  shown  that 
i  n 


Pi N (t) =n ' 


(m(t) 

n: 


m(t) 


n  •  0 . 


(5) 


(6) 


In  other  words,  N ( t )  has  a  Poisson  distri¬ 
bution  with  expected  value  ElN(t)]=m(t) 
for  tj^O  and  m(t)  is  called  the  mean  value 
function  of  the  NHPP. 


The  deterministic  model  (4)  derived 
above  has  been  found  to  be  a  good  descrip¬ 
tor  of  the  software  failure  process  when 
applied  to  actual  data  sets.  For  this 
reason  we  choose  the  mean  value  function 
to  be 

m(t)  =  a(l-e_bt) .  (7) 

Note  that 

EN(»)  =  m(«)  =  a.  (8) 

In  other  words,  the  parameter  'a'  can  be 
interpreted  as  the  expected  total  number 
of  software  errors  to  be  detected  until 
complete  debugging,  assuming  such  debugging 
to  be  perfect  whenever  an  error  occurs. 

2 . 2  Quantities  of  Interest 


We  obtain  expressions  for  the  follow¬ 
ing  quantities  to  establish  the  performance 
measures  for  software  reliability  assess¬ 
ment.  For 


Cumulative  number  of  software  errors. 
For  given  a  and  b  the  distribution  of  N(t), 
the  cumulative  number  of  software  errors 
detected  by  time  t,  is  given  by 

--bt')n  -a  ( l-e-bt) 


PIN  (t)  =n )  =  -(a-(1~e~ 


n ! 


n=0,l,2,...  (9) 

i.e.  N(t)  has  a  Poisson  distribution  with 
mean 

EN ( t )  =  a(l-e~bt) .  (10) 


=  a+a(l-e~bt)-2a(l-e_bt) 


or  Var(N(t))  =  ae  bt .  (14) 

Now  suppose  yn  is  the  number  of  errors 

found  in  a  testing  period  t  ,  i.e.  N(t  )=y  . 

c  n  n  n 

Then  the  conditional  distribution  of  fJ(t  ) 

,  n 


P(N(t  >=x|N(t  )=y  ! 
n  n  n 

or 

P(N(tn)=x|N(tn)=yn)  = 


=  P!N(»)=yn+x) 


y  +x 
_  n 


(yn+x) : 


-a 

e 


x=0 ,1,2,...  (15) 

Also 

E  l N ( tn ) 1 N ( tn ) =yn 1  =  a-yn.  (16) 

This  conditional  distribution  is  important 
for  deciding  whether  the  software  system 
under  development  can  be  released  or  not. 
The  decision  should  be  made  based  on  the 
number  of  errors  remaining  in  the  software 
because  it  plays  an  important  role  in  soft¬ 
ware  reliability. 


Reliability  function.  It  can  be 
shown  that  the  reliability  function,  R(t) , 
after  the  last  failure  occurs  at  time  s  is 
given  by 


-a(e"bs-e'b(s+t> ) 
R(t)  =  e  ' 


(17) 


This  is  the  conditional  reliability  func¬ 
tion. 


2 . 3  Estimation  of  Parameters 

For  the  case  under  consideration  the 
data  is  given  in  pairs  (y.,t^),  i=l,2,...,n 

where  y^  is  the  number  of  software  failure 
by  time  t^ .  To  obtain  the  estimates  of 

parameters  a  and  b  of  the  model  derived  in 
Section  2.1,  we  proceed  as  follows. 


Note  that 

n 

P{N(-)=n)  =  — -  e~a,  n=0 ,1,2,...  (11) 

i.e.  the  distribution  of  N(~),  the  total 
number  of  errors  to  be  detected  if  debug¬ 
ging  is  carried  out  indefinitely,  is  also 
a  Poisson  distribution  with  mean  'a'. 

Remaining  number  of  software  errors 
and  related  results.  Let  ff(t)  be  the  num¬ 
ber  of  errors  remaining  in  the  system  at 
time  t.  Then 

N(t)  =  N(- )-N(t) ,  (12) 

EN(t)  =  ae'bt,  (13) 

and 

Var (N(t) )=Var (N(-) ) +Var (N (t) )- 
-  2Cov (N ( t ) ,N(-) ) 


Property  (ii)  ,  along  with  properties 
(i) ,  (iii)  and  (iv)  of  a  NHPP,  provides  a 
complete  statistical  characterization  for 
the  NHPP  so  that  the  joint  counting  prob¬ 
ability  can  be  determined  for  any  collection 
of  times  0' t^< . . . < t  .  That  is. 


P(N(t1)=y1,N(t2)=y2, .. . ,N(tn)=yn 

)=yi-y 
,yi‘yi-l 


.11  P(N(t.)-N(ti_1)=y.-yi_1l 


n  (m(ti)-m(ti_1)  ] 


-m(tn) 


i=1  (yi-yi-i,: 


(18) 


The  likelihood  function  for  given 
data  (yi,ti) ,  i=l,2,...,n,  is 


ns 


-bt .  .  -bt .  yi  yi-l 
n  ( a (e  -e  L)  1 


denoted  by  DS1  to  DS4 .  The  description  and 
the  total  number  of  errors  detected  during 
the  formal  testing  phase  for  each  data  set 
are  given  in  Table  2. 


-bt 


-a (1-e 


) 


(19) 


Taking  logarithm  of  (19) ,  we  get 
i(a,b|y,t)  enL(a,b|y,t) 

n  n  -bt. 

=  1  i ) tna+  Z  (y.-y.  ,Hn(e  1- 

i=l  1  1  1  i=i  1  1-1 

-bt 


'bti-i. 


a  ( 1-e 


(20) 


Then,  MLE's  a  and  b  must  satisfy 

11  =  0  ll_o 

3a  U'  3b  U 


-bt 


a  (1-e 


and 


(21) 


at  e 
n 


-bt.  -bt. 

-btn=E  (yj-yi-l)  (tie  ^i-l6  > 


'bti  ~bti-l 
e  -e 


(22) 

Equations  (21)  and  (22)  can  be  solved 
numerically  to  obtainthe  maximum  likeli¬ 
hood  estimates  a  and  b. 


3 .  Software  Error  Data  Analysis  for 
a  Large  Scale  Project 

3. 1  A  Large  Scale  Software  Project 

The  data  under  study  has  been  taken 
from  a  large  scale  project  reported  in 
Thayer  et  al  [16].  This  project  represents 
an  initial  delivery  of  a  large  command  and 
control  software  package  written  in 
JOVIAL/ J 4 .  It  consists  of  115,346  total 
source  statements  and  249  routines.  Some 
other  characteristics  of  this  project  are 
summarized  in  Table  1.  The  error  data 
used  for  this  study  is  taken  from  the 
Software  Problem  Reports  (SPR's)  generated 
during  the  formal  testing  phase  of  this 
project.  The  majority  of  software  errors 
were  detected  during  Validation  (Jun  1- 
Aug  12),  Acceptance  (Aug  13-Aug  24),  and 
Integration  phases  (Aug  25-Oct  26) .  How¬ 
ever,  the  operational  data  spanning  a 
period  of  approximately  one  year  (Oct  27- 
Nov  12)  was  also  analyzed.  The  only  time 
frame  readily  available  from  the  data  was 
the  calendar  day.  The  data  also  contain 
the  mistakes  by  the  operators  and  the 
"explanatory"  errors,  i.e.  the  fix  is  a 
change  to  a  comment  statement  or  the  fix 
is  not  to  a  routine.  These  explanatory 
errors  do  or  do  not  indicate  the  type  of 
change.  Therefore,  the  original  data  have 
to  be  restructured  into  four  sets  of  data 


In  this  paper  the  number  of  software 
errors  detected  during  the  formal  testing  is 
counted  on  a  weekly  basis.  Also,  for  each  data 
set  the  software  errors  detected  during  the  first 
nine  weeks  are  eliminated  due  to  the  fact  that 
this  represents  the  period  of  increasing  number 
of  software  errors  and  we  are  interested  in  anal¬ 
yzing  the  software  failures  over  the  period  when 
they  are  decreasing.  Data  about  the  SPR's  fcr 
the  15  week  period  for  the  four  cases  (DS1  to 
DS4)  can  be  obtained  from  references 


3 . 2  Software  Error  Data  Analysis 

In  this  section  we  analyze  the  data 
sets  DS1  to  DS4  to  develop  time  dependent 
models  described  above. 

Mean  value  function.  The  simultaneous 
non-linear  equations  (21)  and  (22)  are 
solved  numerically  for  each  data  set  to 
obtain  the  estimates  a  and  b.  Thus,  for 
data  set  DS1,  the  solution  is  a=1348, 
b=0.124  and  the  fitted  mean  value  function 

is  1348  (1-e  °'124t).  This  is  also  an 
estimate  of  the  expected  number  of  software 
errors  detected  by  time  t.  A  plot  of  the 
actual  and  the  fitted  values  of  the  number 
of  error  detected  during  formal  testing  for 
this  case  is  given  in  Figure  1.  Also  shown 
in  this  figure  are  the  90%  upper  and  lower 
bounds  for  the  N(t)  process  which  can  be 
computed  from  equation  (9) .  Inspection  of 
this  figure  indicates  that  the  fit  is  very 
good.  Estimates  for  other  data  sets  are 
obtained  similarly  and  are  summarized  in 
Table  3. 


Fig.  1.  Actual  and  fitted  software  errors 
and  90%  bounds  for  the  N ( t ) 
process  for  Data  Set  DS1 


136 


TABLE  1 

Project  Characteristics  (TRW) 


Size  (Total  source  statement) 

115,346 

Number  of  routines 

249 

Language 

JOVIAL/ J  4 

Formal  Requirements 

To  function  level 

Co-contr actor 

Yes 

Subcontractor 

No 

Operating  Mode 

Batch 

Formal  Testing 

Validation  (6/1/73-8/12/73) 

Acceptance  (8/13/73-8/24/73) 

Integration  (8/25/73-10/26/73) 

Operational  Demonstration  (10/27/73-11/12/7!) 

TABLE  2 

Description  of  the  Data  Set 


Data  Set 

Description 

Total  Number  of  Errors 

6/1/73-10/26/73 

(24  weeks) 

10/27/73-11/12/73 

(22  weeks) 

DS1 

Original  Data  -  TT  -  EX1  -  EX 2 

2191 

198 

DS2 

Original  Data  -  TT  -  EX1 

2621 

263 

DS3 

Original  Data  -  TT 

4367 

540 

DS4 

Original  Data  -  TT  -  EX2 

3937 

475 

TT  represents  the  mistakes  by  the  operators. 

EX1  represents  the  explanatory  errors  which  do  not  indicate  what  type  of  change  (module, 
documentation,  compool,  data  base)  was  involved. 

EX2  represents  the  explanatory  errors  which  indicate  type  of  change. 


TABLE  3 

A  Summary  of  Data  Analyses 


- Data  Set 

Quantity  - - 

DS1 

DS2 

DS3 

DS4 

a 

1348 

1823 

3958 

3466 

mm 

b 

0.124 

0.112 

0.0768 

0.0771 

"V  JB 

Var (a) 

48.7 

62.2 

147.3 

136.6 

10.3 

Var (b) 

0.00745 

0.00643 

0.00460 

0.00492 

0.00365 

pa,b 

-0.571 

-0.648 

-0.856 

-0.855 

-0.002 

Estimated  Number  of  Remaining 

Errors  at  the  end  of  Integration 
Testing 

210 

340 

1251 

1084 

0 

Number  of  Errors  Detected  During 
Operational  Demonstration  Period 

198 

263 

540 

475 

— 

Joint  confidence  region.  To  obtain 
a  (l-n)lOO*  joint  confidence  region  for  a 
and  b  we  use  the  following  approximation: 

f !  (a,b|y,t)-i(a,b|y,t)  =  j  *2;n'  (23) 


From  (23)  and  (20)  we  get 

n  n  -bt . 

>'■  <y;-yi_1>:lo<3a+  }'  <y,-y.  ,)log(e  1_  - 
i=l  1  1  *  i=l 

-bt .  _bt 

-e  x)  -  a ( 1-e  n)  =  c  (24) 


where 
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c  =  Ma,b|X)  -  §  X 2 . a 

Confidence  regions  for  desired  values  of  a 
can  be  obtained  by  solving  equation  (24) . 
For  date  set  DSl,  the  joint  confidence 
regions  for  a,  b  for  a=.10,  .25  and  .50 
are  drawn  in  Figure  2. 


PARAMETER  b 


Fig.  2.  Joint  confidence  regions  for  a 
and  b  for  Data  Set  DSl 


Asymptotic  properties.  For  large  n, 
the  mle's  (a,b)  follow  a  bivariate  normal 
distribution  (BVN) : 


where 

given 


the  variance-covariance 
by 

t  -1 

j.  _  raa  rab 
rba  rbb . 


matrix 


(25) 
Z  is 


(26) 


and 


ab 


-E 


32l 

3a3b 


(27) 


For  data  set  DSl  the  estimated  variance- 
covariance  matrix  is 

'  =  ["2368  -0.2071  (2g) 

[-0.2071  5. 554*10“ 

From  (28)  the  standard  deviations  of  a  and 


b  are  Var(a)=48.7,  Var (b) =0 . 00745  and  the 
estimated  correlation  coefficient  is 


Values  of  Var(a),  Var(b)  and  n  , 

a ,  b 

for  data  sets  DSl  to  DS4  are  given  in 
Table  3.  Also  given  is  the  number  of 
errors  detected  during  the  Operational 


Phase.  By  comparing  the  entries  in  the  last 
two  rows,  we  see  that  the  predicted  values 
are  quite  close  to  actual  ones. 

Expected  number  of  remaining  errors 
and  confidence  bounds’]  The  expected  number 
of  remaining  errors  is  computed  from  equa¬ 
tion  (13)  for  estimated  values  of  a  and  b. 
Also,  we  can  show  that  100(l-a)%  confidence 


bounds  for  EFT ( t )  are 

given  by 

<f(a,b)±tn_2;a/2 

V( f (a,b) ) } 

(29) 

where 

f  (a , b)  =  ae  bt 

/— \ 

1  1  I  3S  1 

1  if  1 

(30) 

Vb/ 

a=a , 

b=b 

90%  confidence  bounds  for  EN(t)  for  data 
set  DSl  are  computed  from  the  above  equa¬ 
tions  and  are  shown  in  Figure  3.  Also  shown 
is  a  plot  of  the  actual  number  of  remaining 
errors  during  the  15  week  period.  From  the 
figure  we  see  that  the  actual  errors  fall 
within  the  90%  bounds. 


Fig.  3.  90%  confidence  bounds  for  the 

expected  number  of  remaining  soft¬ 
ware  errors  at  time  t  for  Data 
Set  DSl 

4 .  Software  Error  Data  Analysis 
Based  on  CPU  Time 

In  this  section  we  discuss  the  anal¬ 
ysis  of  software  error  data  collected  from 
a  small  project.  The  errors  in  this  case 
are  considered  as  a  function  of  CPU  times 
rather  than  the  calendar  time,  as  was  the 
case  for  the  data  analyzed  in  Section  3. 
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4. 1  Software  Error  Data 

The  error  data,  as  reported  by 
Wagoner  117),  were  collected  from  the 
computer  program  F-llD  which  obtains  fil¬ 
tered  navigation  solutions  from  data  taken 
by  receiving  equipment  carried  on  an  air¬ 
craft.  This  program  is  a  data  reduction 
program  consisting  of  approximately  3000 
4000  FORTRAN  statements  and  required  one 
month  for  development  and  checkout.  The 
number  of  errors  were  counted  with  CPU 
(running)  time.  Wagoner  117]  analyzed  the 
data  by  a  Weibull  distribution  with  CPU 
time  as  the  independent  variable. 


Data  Analysis 


The  analysis  procedure  for  this  data 
set  is  similar  to  that  of  Section  3.2. 


For  the  F-llD  data  given  in  (17),  the 
estimates  of  a  and  b  from  (21)  and  (22)  are 
a=107  and  b=0.0367.  The  fitted  mean  value 
function  is 


m(t)  =  107 (1-e 


-  0 . 0  3  6  7 1 , 


This  is  an  estimate  of  the  expected 
number  of  software  errors  detected  by 
time  t,  where  t  is  in  CPU  seconds.  A  plot 
of  the  actual  and  the  fitted  values  of  the 
number  of  errors  detected  during  the  devel¬ 
opment  and  checkout  phases  of  F-llD  pro¬ 
gram  in  CPU  time  is  given  in  Figure  4. 

Also  shown  are  the  90%  upper  and  lower 
confidence  bounds  for  the  N(t)  process. 
Inspection  of  this  figure  indicates  that 
the  fitted  model  satisfactory  explains  the 
actual  error  occurrence  phenomenon. 


TIME  (SECONDS) 

Fig.  4.  Plots  of  the  actual  and  fitted 
number  of  errors  and  confidence 
bounds  versus  CPU  time 


The  joint  confidence  regions  for  a 
and  b  for  u=0.10,  0.25  and  0.50  are  drawn 
in  Figure  5. 


0025  003  0035  0  0*  0045  005 

PARAMETER  b 

Fig.  5.  Joint  confidence  regions  for  b 
and  a  for  F-llD  Data  Set 

The  estimated  variance-covariance 
matrix  is 

'  107  -8.0-10-5 

'■  ~  -5  -S' 

.  -8.0x10  3  1 . 329' 10  3  J 

and  the  estimated  correlation  coefficient 

is  ;■  =-0.002. 

a ,  b 

The  90%  confidence  bounds  for  the 
expected  number  of  remaining  errors,  fCN(t), 
are  shown  in  Figure  6.  Also  shown  is  a 
plot  of  the  actual  number  of  remaining 
errors  during  250  CPU  seconds  period. 


TIME  (SECONOI 

Fig.  6.  90%  confidence  bounds  for  the 

expected  number  of  remaining 
software  errors  at  time  t  for 
Data  Set  F-llD 
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5 .  Concluding  Remarks 

In  this  paper  we  have  used  a  non- 
homogeneous  Poisson  process  to  model  the 
software  error  phenomenon  in  a  two  soft¬ 
ware  systems,  one  large  and  one  small. 

Based  on  a  deterministic  analysis  of  the 
failure  process,  we  have  derived  the  mean 
value  function  of  this  process.  Various 
quantities  of  interest  are  derived  to  pro¬ 
vide  quantitative  measures  for  software 
quality  assessment.  Estimation  of  para¬ 
meters  has  been  discussed  and  the  results 
used  to  get  fitted  models  for  several  data 
sets.  Also,  joint  confidence  bounds  for 
the  parameters  are  obtained.  These  can  be 
used  to  obtain  confidence  bounds  for  the 
performance  measures  of  the  software  sys¬ 
tem  being  analyzed. 
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ABSTRACT 

The  paper  analyzes  the  nature  of  the  laws 
that  have  been  identified  by  the  work  of  Belady, 
Lehman  and  others.  Program  maintenance  and  evolu¬ 
tion  is  planned,  managed  and  implemented  by  people, 
yet  the  laws  that  govern  the  process  are  more  akin 
to  those  of  biology  and  even  modern  physics,  then, 
as  had  been  previously  supposed,  even  more  fuzzy 
than  those  that  apply  to  economics  and  sociology. 

After  a  brief  discussion  of  the  first  four 
laws,  highlighting  the  underlying  phenomena  and 
natural  attributes,  the  paper  concentrates  on  the 
fifth  law.  It  shows  what,  how,  and  why  it  repre¬ 
sents  a  conservation  phenomenon,  the  conservation 
of  familiarity. 


INTRODUCTION 

A  recent  paper  [LEH78]  discussed  five  laws 
of  taJigZ-pKogAam  evolution  dynamics;  laws  that  have 
emerged  the  studies  of  Belady,  Lehman  and 

others  o. et*  he  last  seven  years  as  summarized  in 
the  bibliography  of  the  above  referenced  paper. 

The  main  objective  of  the  present  contribution  is 
to  discuss  one  specific  aspect  of  these  laws,  C0M- 
■iZAvation,  Some  general  Introductory  remarks  are, 
however,  desirable. 

In  the  first  place  we  should  stress  that  the 
discussion  here  is  limited  to  taAgc  programs  de¬ 
fined  by;  "a  toAgz-pAogAam  i*  one  that  ha. 4  been  im- 
ptzmzntzd  ua  madntainzd  by  at  fettif  two  indepzndz- 
ntltj  managed  groups".  Such  a  program  will  be  the 
responsibility  of  an  organization  having  two  or 
more  levels  of  management,  will  have  the  property 
of  "variety”  [BEL78]  and  will  be  outside  the  intel¬ 
lectual  grasp  of  the  individual.  A  program  not 
satisfying  the  definition  may  possess  one  or 
another  of  the  other  properties  and  may  display 
some  or  all  of  the  characteristics  of  "?aAgznZA!>^' 
[BEL78],  but  we  do  not  consider  them  here. 

PROGRAM  EVOLUTION  LAWS  WITHIN 
THE  SPECTRUM  OF  SYSTEMS  OF  LAWS 

The  evolution  of  software  systems  is  clearly 
not  a  natural  process  governed  by  immutable  laws 
of  nature.  Changes  to  a  program  are  neither  init- 
ated  nor  occur  spontaneously.  People  do  the  work; 


amend  or  emend  the  requirement,  the  specification, 
the  code,  the  documentation;  repair  the  system; 
improve  and  enhance  it.  They  do  this  in  response 
to  fault  reports,  user  requests,  business  require¬ 
ments,  managerial  directives  or  their  own  inspir¬ 
ation.  Human  thought  and  judgement  plays  a  de¬ 
cisive  role  in  the  process  that  results  from  this 
continuing  sequence  of  exogenous,  seemingly  stoch¬ 
astic,  inputs. 

Thus,  we  should  not  expect  to  discover  laws 
of  program  evolution  that  yield  the  precision  and 
predictability  of  the  laws  of  physics  [LEH77]. 

Any  laws  that  emerge  could  reasonably  have  been 
expected  to  be  even  weaker  than  biological  laws 
since  the  latter  arise  from  observation  cf  the 
collective  behaviour  of  cellular  systems  which, 
whilst  living,  are,  at  least  at  the  level  of 
human  understanding,  non-intelligent .  We  should 
not  even  expect  to  observe  behaviour  that  displays 
the  regularity  that  has  been  abstracted  into  laws 
in  the  social  and  economic  sciences,  for  example, 
the  so-called  "Law  of  Supply  and  Demand".  After 
all  the  programming  process  is  planned  and  con¬ 
trolled  by  an  organizational  and  management  struc¬ 
ture  that  is  sensitive  and  reactive  to  the  de¬ 
mands,  pressures,  circumstances  and  contingencies 
of  each  moment.  Thus,  no  regularity  should  be 
expected.  Each  action  and  event  is  surely  deter¬ 
mined  by  the  needs  of  the  moment.  The  process 
must  surely  be  completely  stochastic. 

One  of  the  first  and  most  surprising,  yet 
most  fundamental,  results  of  our  observation  and 
analysis  of  the  dynamics  of  evolution  of  some 
eight  programs,  ranging  over  a  wide  spectrum  of 
implementation  and  usage  environments,  has  been 
that  this  is  not  so.  Regularities,  trends  and 
patterns  appear  and  dominate  large  program  evol¬ 
ution.  The  common  features  and  patterns  of 
behaviour  reflect  common  characteristics  (BEL78] 
from  which  laws  can  be  deduced;  laws  which,  within 
the  spectrum  outlined,  lie  unexpectedly  somewhere 
between  the  laws  applying  to  biological  organisms 
and  those  that  emerge  from  the  study  of  socio¬ 
economic  systems.  And  these  laws  in  turn  can  be 
used  to  create  powerful,  reliable  and  cost- 
effective  life-cycle  management  tools. 

THE  UNDERLYING  CAUSE  FOR  REGULARITY 

Once  the  phenomenon  has  been  recognized, 
the  mechanisms  underlying  it  are  not  difficult 
to  understand.  In  the  first  instance,  the  program 
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and  its  documentation  in  all  their  versions  -  the 
system  -  has  a  damping  effect  analogous  to  an  ever- 
increasing  inertial  mass.  As  a  totally  unintelli¬ 
gent  mechanism,  the  computer  executes,  and  there¬ 
fore  impacts,  its  operating  environment,  precisely 
and  only  as  the  code  in  association  with  any  input 
data,  instructs.  Good  intentions,  hopes  of  cor¬ 
rectness,  wishful  thinking,  even  managerial  edict 
cannot  change  the  semantics  of  the  code  as  written 
or  its  effect  when  executed.  Nor  can  they  affect 
the  relationship  between  a  specification  and  its 
implementation,  or  that  between  both  of  these  and 
operational  circumstances.  That  is  the  freedom  of 
the  designer,  the  implementor  and  the  user  to  make 
changes  or  additions  and  to  obtain  the  desired  pro¬ 
gram  behavior  is  increasingly  constrained  by  exist¬ 
ing  ei>de,  documentation  and  past  program  applica¬ 
tion  and  behaviour.  The  code  is  unforgiving;  there 
is  no  room  for  imprecision  or  logical  error.  Thus, 
any  deviation  leads  to  a  need  for  corrective 
actions.  The  resultant  feedback  over  the  entire 
system  process  and  organization  increasingly  causes 
the  observed  regularity. 

These  facts  alone  suffice  to  explain  the  con¬ 
sistency  of  the  observations.  Other  factors  merely 
strengthen  the  phenomenon.  In  particular  large 
programs  are,  in  general,  created  within  large 
organizations  and  for  large  numbers  of  users; 
otherwise  they  could  not  be  economically  justified 
or  maintained.  Moreover,  their  very  size  and  the 
complexity  of  both  the  program  and  the  application 
for  which  it  is  intended  means  that  decisions  take 
time,  sometimes  considerable  time,  and  large 
numbers  of  people  to  implement.  The  resultant  de¬ 
lays  provide  exogenous  pressures  and  endogenous 
opportunities  for  change.  Thus  the  overall  circum¬ 
stance  and  environment  acts  like  a  filter  to  smootli 
out  the  global  consequences  of  individual  decis¬ 
ions,  whilst  also,  paradoxically,  adding  the  occas¬ 
ional  stochastic  disturbance.  It  also  acts  as  a 
brake  -  economic  and  social  -  that  inhibits  or 
softens  decisions  that  would  have  too  drastic  an 
impact.  For  example,  large  budgets  can,  in 
general,  be  neither  suddenly  terminated  nor  drasti¬ 
cally  increased.  In  practice  they  can  only  be 
changed  by  a  fractional  amount.  Similarly  a  work 
force  cannot  be  instantaneously  retrained,  relocat¬ 
ed  or  dismissed;  at  best  a  task  force  can  be  sent 
In,  and  can  cause  a  local  perturbation. 

In  summary,  large  program  creation  and  main¬ 
tenance  occurs  in  an  environment  with  many  levels 
of  arbitration,  smoothing  and  feedback  correction 
that,  in  general,  act  to  eliminate  perturbations 
at  the  output.  The  existence  of  regularity  and  of 
laws  abstracting  that  regularity  becomes  reasonable 
and  understandable. 

THE  GROSS  NATURE  OF  THF,  LAWS 

The  detailed  instantaneous  behaviour  of  the 
progratming  process  and  of  the  system  that  is  the 
object  of  its  activity,  is  the  consequence  of 
human  decision  and  action.  Specific  individual 
events  cannot  therefore  be  predicted  more  precisely 


than  can  the  specific  acts  of  an  individual.  Any 
laws  can  thus  only  relate  to  the  gross,  statisti¬ 
cal  dynamics  of  a  large  program  system  over  a 
period  of  time.  But  as  such  they  find  application 
in  system  prognosis,  planning  and  project  control. 
Equally  (or  even  more  importantly)  they  yield 
understanding  that  should  permit  improvement  of 
the  programming  process  and  advance  the  develop¬ 
ment  of  software  engineering  science  and  practice. 

FEEDBACK  CONSEQUENCES  OF  INCREASING 
UNDERSTANDING  OF  THF.  PROCESS 


Increasing  understanding  in  turn  raises 
another  problem.  To  what  extent  can  knowledge  and 
understanding  of  the  laws  that  regulate  the  pro¬ 
gramming  process  in  an  environment  unaware  or  in¬ 
sensitive  to  their  existence,  be  used  to  invalid¬ 
ate  them,  or  for  that  matter,  to  perpetuate  them 
by  appropriate  (or  inappropriate)  managerial  res¬ 
ponses?  Space  does  not  permit  us  to  address  this 
question  in  detail.  We  merely  assert  that  the 
present  laws  reflect  deeply  rooted  aspects  of 
human  and  organizational  behaviour.  Associated 
with  the  mechanistic  forces  that  define  and  con¬ 
trol  the  automatic  computational  process,  they  are 
sufficiently  fundamental  to  be  treated  as  absol¬ 
ute,  at  least  in  our  generation.  As  knowledge  of 
them  is  permitted  to  impact  t he  programming  j  re¬ 
cess,  as  programming  technology  advances,  they 
may  require  restatement  or  revision,  they  may  be¬ 
come  irrelevant  or  obsolete.  But  for  the  time 
being,  we  must  accept  and  learn  to  use,  not  to 
ignore,  them. 

THE  LAWS 

The  First  Law 

We  now  comment  briefly  on  the  laws  summarized 
in  figure  1,  so  as  to  expose  some  of  the  more  fun¬ 
damental  truths  that  they  reflect.  The  laws  have 
been  fully  discussed  in  earlier  publications 
[LEH78  and  bibliography] . 

The  Lme  C $  Cent  inning  Change  arises  from  the 
fact  that  the  world,  in  this  case  the  computing 
environment,  undergoes  continuing  change;  all  pro¬ 
grams  are  models  of  some  part,  aspect  or  process 
of  the  world.  They  must  therefore  be  changed  to 
keep  pace  with  the  needs  of  a  changing  environ¬ 
ment,  or  become  progressively  less  relevant,  less 
useful  and  less  cost  effective. 

The  Second  Law 

The  Late  0/  CemrCexittf  (an  analogue 

or  instance  of  the  second  law  of  thermodynamics) 
is  a  consequence  of  the  fact  that  a  system  is 
changed  to  improve  its  capabilities  and  to  do  so 
in  a  cost-effective  manner.  Thus  change  objec¬ 
tives  are  expressed  in  terms  of  performance 
targets,  system  resources  that  will  be  required 
during  execution,  implementation  resources,  com¬ 
pletion  dates  and  so  on.  With  multiple  objectives 
it  is  impossible  to  optimize  all  simultaneously. 
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I.  THE  LAW  OF  CONTINUING  CHANGE 

A  LARGE-PROGRAM  THAT  IS  USED  UNDERGOES 
CONTINUING  CHANGE  OR  BECOMES  PROGRESS¬ 
IVELY  LESS  USEFUL.  THE  CHANGE  OR  DECAY 
PROCESS  CONTINUES  UNTIL  IT  IS  JUDGED 
MORE  COST-EFFECTIVE  TO  REPLACE  THE  SYSTEM 
WITH  A  RE-CREATED  VERSION. 

II.  THE  LAW  OF  INCREASING  COMPLEXITY 

AS  A  LARGE-PROGRAM  IS  CONTINUOUSLY 
CHANGED  ITS  COMPLEXITY,  REFLECTING 
DETERIORATING  STRUCTURE,  INCREASES 
UNLESS  WORK  IS  DONE  TO  MAINTAIN  OR 
REDUCE  IT. 

III.  THE  FUNDAMENTAL  LAW  OF  LARGE- 
PROGRAM  EVOLUTION 

THERE  EXISTS  A  DYNAMICS  OF  LARGE-PROGRAM 
EVOLUTION  WHICH  CAUSES  MEASURES  OF  GLOBAL 
PROJECT  AND  SYSTEM  ATTRIBUTES  TO  BE  CYC¬ 
LICALLY  SELF-REGULATING  WITH  STATISTIC¬ 
ALLY  DETERMINABLE  TRENDS  AND  INVARIANCES. 

IV.  THE  LAW  OF  INVARIANT  WORK  RATE 

THE  GLOBAL  ACTIVITY  RATE  IN  A  LARGE  PRO¬ 
GRAMMING  PROJECT  IS  STATISTICALLY  IN¬ 
VARIANT.  (FOR  EXAMPLE:  NORMALLY  DIS¬ 
TRIBUTED  IN  TIME  WITH  CONSTANT  MEAN  AND 
VARIANCE). 

V.  THE  LAW  OF  CONSERVATION  OF 

FAMILIARITY  (PERCEIVED  COMPLEXITY) 

FOR  RELIABLE,  PLANNED,  EVOLUTION,  A 
LARGE-PROGRAM  UNDERGOING  CHANGE  MUST  BE 
MADE  AVAILABLE  FOR  REGULAR  USER  EXECU¬ 
TION  (RELEASED)  AT  INTERVALS  DETERMINED 
BY  A  SAFE  MAXIMUM  RELEASE  CONTENT 
(CHANGED  OR  NEW)  WHICH,  IF  EXCEEDED, 
CAUSES  INTEGRATION,  QUALITY  AND  USAGE 
PROBLEMS  WITH  TIME  AND  COST  OVER-RUNS 
WHOSE  CONSEQUENCES  MAINTAIN  THE  AVERAGE 
INCREMENT  OF  GROWTH  INVARIANT. 


FIGURE  Is  THE  FIVE  LAWS  OF  LARGE-PROGRAM 
EVOLUTION 


Hence  the  completed  project  and  system  must  repre¬ 
sent  a  compromise  that  results  from  judgements  and 
decisions  taken  during  the  implementation  process, 
often  on  the  basis  of  time,  group  or  management 
local  optimization.  Structural  maintenance,  which 
is  not  often  mentioned  in  project  objectives  since 
it  yields  no  immediate  or  visible  benefit,  will 
inevitably  suffer.  Each  change  will  thus  degrade 
system  structure  a  little  more.  The  resultant 
continuing  accumulation  of  gradual  degradation, 
ultimately  leads  to  the  point  where  the  system  can 
no  longer  be  cost-effectively  maintained  and  en¬ 
hanced  unless  and  until  a  clean-up  is  undertaken. 

The  Third  Law 

The  Fundamental  Law  of  Lange -Pnognam  evolu¬ 
tion  was  previously  called  the  Law  of  Statistically 
Smooth.  Gnowth  [LEH78] .  It  expresses  the  observa¬ 
tion  we  have  already  made  above  that  large-program 
evolution  is  not  a  purely  stochastic  process  that, 
at  each  instance,  reflects  the  decisions  and 
actions  of  the  people  in  the  environment  in  which 
it  is  maintained  and  in  which  it  is  used.  The  Law 
states  that,  at  least  in  the  current  state  of  the 
art,  there  exists  a  dynamics  whose  characterist ics 
are  determined  during  conception  and  the  early 
life  of  the  system,  the  process  and  the  organiza¬ 
tion  that  maintains  them.  The  characteristics  of 
this  dynamics  increasingly  determines  the  gross 
trends  of  the  maintenance  and  enhancement  process. 
Feedback  effects  are  an  inherent  factor  in  the 
self-stabilizing  control  process  that  evolves. 

Hence  cyclic  effects,  not  necessarily  with  a  pure 
period,  emerge. 

The  Fourth  Law 

The  Law  of  Invariant  Wonk  Rate  is  assumed  to 
reflect  a  conservation  property.  The  quantity  or 
quality  being  conserved  has  however  not  been 
clearly  identified.  The  Law  is  believed  to  be  a 
consequence  of  the  fact  that,  in  general,  human 
organizations  seek,  and  seek  to  maintain,  stabil¬ 
ity  or,  more  accurately,  stable  growth.  As  sug¬ 
gested  above,  sudden  changes  in,  for  example, 
staffing,  budget  allocations,  manufacturing  levels, 
product  types  are  avoided,  are,  in  general,  not 
possible.  A  variety  of  managerial,  union  and 
governmental  checks,  balances  and  controls  ensure 
overall  progress  to  the  ever  changing,  ever  dis¬ 
tant  objective  of  the  organization;  or  its  col¬ 
lapse.  The  Law  also,  in  a  sense,  reflects  the 
organizational  response  to  the  limitation  which, 
we  shall  show,  underlies  the  fifth  law. 

Thus  with  hindsight  it  becomes  clear  that 
the  discovery  of  some  derivative  of  an  activity 
measure  that  is  invariant,  or  better,  statisti¬ 
cally  invariant  (for  example,  normally  distributed 
in  time  with  constant  mean  and  variance)  could 
have  been  anticipated.  What  Is  not  really  under¬ 
stood  (except  vaguely  as  reflecting  the  limita¬ 
tions  of  absorbtive  capacity)  is  why  in  large- 
program  maintenance  projects,  measures  of  work 
input  rate  should  be  the  quantity  to  display  such 
invariance.  But  the  fact  remains  that  for  all  the 
systems  observed,  the  count  of  modules  changed 
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(handled)  or  changes  made  per  unit  of  time,  as 
averaged  over  each  release  interval,  has  been 
statistically  invariant  over  the  period  of 
observation. 

The  Fifth  Law 

The  Lau?  Con^eAvation  oj  FcmiUjvvlty 
( PcTtc&ivfcd  Complexity)  was  previously  referred  to 
[LEH78]  as  the  Law  Incremental  Gnouith  limit*. 
Its  discovery  was  based  on  data  from  three  of  the 
systems  observed,  each  of  which  was  made  avail¬ 
able  to  users  on  a  release  basis.  In  each  case 
the  incremental  growth  of  the  program  varied 
widely  from  release  to  release.  But  the  average 
over  a  relatively  large  number  of  releases  re¬ 
mained  remarkably  constant.  That  is  a  high-growth 
release  would  tend  to  be  followed  by  one  with 
little  or  no  growth,  or  even  by  a  system  shrink¬ 
age.  Or  two  releases,  each  of  about  average 
growth,  would  be  followed  by  one  with  only  slight 
growth.  Moreover  releases  for  which  the  net 
growth  exceeded  about  twice  the  average,  proved 
to  be  minor,  or  major,  disasters  (depending  on 
the  degree  of  excess)  with  poor  performance,  poor 
reliability,  high  fault  rates,  cost  and  time 
over-runs.  This  evidence  suggests  that  initial 
release  quality  is  a  non-linear  function  of  the 
release  incremental  growth.  From  a  more  complete 
phenomenological  analysis,  along  the  lines  out¬ 
lined  below,  and  for  which  a  mathematical  model 
needs  to  be  constructed,  we  hypothesise  that  the 
quality  is  exponentially  related  to  the  magnitude 
of  the  changes  implemented  in  the  release. 

The  phenomenon  was  detected  at  a  very 
early  stage  of  our  evolution  dynamics  studies, 
was  featured  in  our  earliest  models  [BEL71],  and 
has  been  applied  as  a  planning  and  control  para¬ 
meter  for  a  number  of  years.  The  full  explana¬ 
tion,  however,  has  only  recently  become  apparent. 

The  release  process  was  originally  iden¬ 
tified  as  a  stabilization  mechanism  [BEL71], 

Once  a  large-program  is  in  general  use  its  code 
and  documentation  are  normally  in  a  state  of  flux. 
A  fault  is  fixed  locally,  perhaps  fixed  differ¬ 
ently  or  not  at  all  In  other  installations.  Minor 
or  major  changes,  local  adaptations,  are  made. 

Code  is  changed  without  a  corresponding  change  to 
documentation.  Documentation  is  changed  to  cor¬ 
respond  to  observed  behaviour  without  a  full  and 
detailed  analysis  of  the  precise  semantics  of  the 
code  within  the  context  of  the  total  system  and 
under  all  possible  environmental  conditions. 

Only  at  the  moment  of  release  does  there  exist  an 
authoritative  version  of  the  program,  the  code 
and  its  documentat ion.  Even  this  may  include 
multiple  versions  of  modules  say,  for  clearly  de¬ 
fined  alternative  situations. 

For  an  old  release  each  Implementor,  each 
tester,  each  salesman,  each  user  will  be  familiar 
with  the  version  of  the  program  with  which  he 
has  been  associated.  This  familiarity  will  have 
bred,  not  contempt,  but  a  certain  degree  of  re¬ 
laxation,  of  ability  to  work  with  the  program  In 
order  to  accomplish  specific  objectives.  The 


program  will  be  manipulated  or  used  without 
(apparent)  need  for  concentrated  thought.  External 
perception  of  its  intrinsic  complexity  will  be  at 
a  minimum.  In  the  limit  the  program  may  be  said  to 
be  approaching  zero  perceZved  complexiXy  for  people 
working  consistently  on  or  with  it. 

As  changes  are  introduced,  as  the  new  release 
is  gradually  created  and  as  it  becomes  available, 
new  and  unfamiliar  code  appears.  The  program 
behaves  differently  in  execution,  in  its  inter¬ 
action  with  and  impact  on  the  operational  environ¬ 
ment.  Even  the  pagination  in  the  previously  fam¬ 
iliar  documentation  has  changed  and  any  need  for 
reference  entails  a  major  search.  The  system  has 
become  uncomfortably  unfamiliar,  the  degree  of  un¬ 
familiarity  depending  on  the  extent  and  inaptitude 
of  the  change  (See  below.). 

A  major  intellectual  effort  is  now  required 
by  each  person  involved  before  any  completely  suc¬ 
cessful  and  cost-effective  interaction  with  the 
new  system  can  occur.  The  system  has  suddenly 
become  complex.  Its  perceived  complexity  is  high. 

Even  those  who  participated  in  the  prepara¬ 
tion  of  the  new  release  will  normally  have  been 
involved  directly  with  only  a  small  part  of  the 
change,  a  small  portion  of  the  system.  They  too 
can  only  learn  to  understand  the  new  system  in  its 
totality  from  the  moment  of  its  completion.  More¬ 
over  until  the  complete  system  is  available  all 
acquisition  of  knowledge  and  understanding  of  the 
changes  and  of  the  new  system,  must  be  based  on 
nQA&lv\.y  of  code  and  documentation  text,  or  on 
partial  execution  of  system  components  on  test 
cases  or  system  models.  Only  with  final  integra¬ 
tion  does  the  full  executable  program  become  avail¬ 
able.  We  shall  suggest  that  when  the  release 
content  exceeds  some  critical  amount  only  opera¬ 
tional  experience  with  the  complete,  system  can 
bring  or  restore  the  degree  of  knowledge  and  fam¬ 
iliarity,  the  global  viewpoint,  that  is  essential 
for  the  cost-effective  maintenance,  enhancement 
and  exploitation  of  the  large  program. 

Thus,  in  general,  at  the  moment  of  release 
or  shortly  before  that  time  a  major  learning 
effort  will  begin  and  this  will  involve  all  those 
associated  with  the  system,  not  just  the  users. 

All  changes  and  additions  must  be  identified, 
understood  and  experienced ,  their  significance 
within  the  operational  context  of  the  total  system, 
appreciated.  Once  this  has  been  done  the  old 
degree  of  comfort  with  the  system  will  return,  its 
perceived  complexity  is  once  again  zero,  the  level 
of  familiarity  has  been  restored. 

Clearly  the  . amount  of  hard  work  that  must  be 
involved  to  achieve  this,  the  intellectual  effort 
required,  depends  amongst  other  factors,  on  the 
attitudes  of  people,  on  the  organization,  on  the 
number  and  magnitude  of  changes  introduced.  More¬ 
over,  as  already  observed,  this  dependence  must  be 
at  least  quadratic,  probably  exponential.  For 
changes  to  the  system  interact  with  one  another. 
Changes  implemented  in  the  same  release,  that  is 
in  the  same  time  interval,  must  each  be  understood 
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not  only  in  themselves,  but  also  in  the  context  of 
all  the  others,  of  the  unchanged  parts  of  the  sys¬ 
tem  and  of  past  and  future  applications.  The 
source  of  the  drastic  growth  in  difficulty  in  re¬ 
storing  the  familiarity  required  of  and  with  the 
program  if  it  is  to  be  once  again  correctly  and 
efficiently  maintained  and  exploited,  is  clear. 
Figure  2  is  intended  to  suggest  "'difficulty" 
versus  magnitude  of  release  content*  relationship. 
The  axis  of  the  curve  are  not  calibrated  since  at 
present  neither  concept  nor  suitable  measures  are 
well  defined.  But  their  relevance  to  the  Fifth 
Law  requires  us  to  analyze  the  nature  and  conse¬ 
quence  of  these  concepts.  Incidentally,  the  con¬ 
cept  of  "difficulty"  introduced  here  clearly 
relates  to  that  of  Norden  and  Putnam  fSLC77J.  But 
the  precise  relationship  has  not  yet  been  estab¬ 
lished  . 


Before  proceeding  with  the  analysis  one 
brief  remark  should  be  made.  The  Fifth  Law  as  now 
formulated  talks  about  release  content  and  its 
magnitude  (previously  incremental  growth).  The 
content  certainly  includes  new  or  changed  code,  but 
must  also  involve  deleted  code,  new  or  changed 
function  and  new  or  changed  documentation.  For 
example,  the  observations  that  inspired  the  law 
were  measured  in  terms  of  modular  growth  and  we 
have  in  fact  consistently  found  module-based  mea¬ 
sures  in  terms  of  modular  growth  and  we  have  in 
fact  consistently  found  module-based  measures  to  be 
more  accurate  and  useful  than  those  based  on  in¬ 
struction  counts.  But  there  is  at  present  no  clear 
or  agreed  measure  of  release  content,  the  magnitude 
of  a  change.  It  is  not  even  clear  that  the  concept 
possesses  a  metric.  It  must  be  left  to  the  future 
to  identify  or  define  measures  and  to  provide  an 
improved  formulation  of  the  Fifth  Lav*.  Meanwhile 
we  must  clarify  the  concepts  and  increase  under¬ 
standing  of,  at  least,  the  phenomenology,  thereby 
we  will  provide  a  basis  for  ultimate  formalization. 

With  this  clarification  we  may  now  proceed 
with  the  analysis.  Everyone*s  ability  to  master  a 
new  or  changed  object  is  limited,  though  people 
clearly  differ  in  their  capability  to  absorb  the 
new  knowledge,  to  achieve  full  understanding  of  the 
changed  program.  Thus  the  impact  of  changes  will 
undoubtedly  vary  from  person  to  person  according 
to  many  factors  that  will  include,  but  are  not 
limited  to,  their  learning  ability  and  absorptive 
capacity.  But  for  a  given  large-program  in  its 
given  environment  and  where  many  people  are  inevit¬ 
ably  involved,  the  delays  that  are  incurred,  the 
mistakes  that  are  made,  the  destructuring  that 
occurs  before  full  familiarity  is  restored,  the 
direct  and  indirect  cost  of  familiarization, 
depends  on  the  average  ability  of  all  the  people 
involved.  After  all,  while  the  above-average 
person  will  regain  mastery  more  quickly,  make 
fewer  mistakes,  achieve  a  temporary  advantage 
(which  will  probably  cause  him  to  be  promoted  or 
moved  out  of  the  project  fairly  quickly),  the 
below-average  person  will  fall  behind,  will  per¬ 
haps  lose  contact,  make  more  mistakes,  do  more 
damage.  He  may  well  be  re-assigned  to  a  Less 
demanding  role  with  less  impact  on  the  project  or 
even  fired.  But  the  damage  will  have  been  done 


and  with  hiring  policies 

established  make  up  of  the  project,  the  average 
level  will  remain  at  best,  unchanged;  more  probably 
decline  [LEH64 ] . 

For  different  organizations,  different  sys¬ 
tems,  different  structures,  different  methodolo¬ 
gies,  different  processes,  the  average  level  will, 
of  course,  be  different.  This  implies  that  any 
models  will  contain  exogenous  variables.  Thus  it 
establishes  a  potential  for  improving  the  level  in 
any  given  circumstance  once  the  phenomenon,  the 
organization  and  the  programming  process  are  under¬ 
stood. 


Difficulty 
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Given  the  above  insight  irto  the  growth  of 
difficulty  in  understanding  and  working  with 'a 
system  as  the  release  content  increases,  and  its 
consequences,  as  a  result  of  feedback  that  slows 
down  both  utilization  and  further  evolution  as  sys¬ 
tem  structure  deteriorates,  that  the  number  of 
faults  increases,  documentation  lags  and  perfor¬ 
mance  declines,  we  are  now  in  a  position  to  appre¬ 
ciate  the  Fifth  Law. 

If  the  release  content,  the  magnitude  of  the 
change  and/or  the  incremental  growth,  is  less  than 
some  threshold  region  (Figure  2)  the  integration 
and  operational  installation  of  the  new  system 
should  he  fairly  straightforward.  No  major  pro¬ 
blems  should  be  exper ienc<-d  in  mastering  the  new 
release;  it  may  well  be  that  the  change  may  be 
absorbed,  familiarity  restored  without  actual  sys¬ 
tem  operation  exposure. 

When  the  release  content  lies  in  a  threshold 
region  which  may  not  be  precisely  delineable,  qual¬ 
ity,  performance,  completion  and  installation  pro¬ 
blems  are  to  be  expected.  Slippage  and  cost  over¬ 
run  mav  occur.  A  subsequent,  low  growth,  release 
will  be  required  to  clean  up  the  system  and  restore 
it  to  a  state  that  permits  further  cost-effective 
evolut ion . 
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Finally  if  a  release  whose  content  exceeds 
the  threshold  region  is  attempted  serious  problems 
will  be  encountered.  Slippage  and  cost  over-run 
will  occur  unless  the  plans  take  account  of  the 
greatly  increased  difficulties  that  will  be  exper¬ 
ienced.  If  not  properly  planned  it  may  lead  to 
the  effective  collapse  of  the  system  or*  as  we  have 
observed  in  at  least  two  instances*  to  an  effect 
that  we  have  termed  system  £i66'tOit.  Since  only 
release  of  the  system  to  end  users  and  to  the 
developers  provides  full  exposure,  even  when  ade¬ 
quate  resources  and  time  has  been  provided,  such 
a  release  will  still  require  to  be  followed  by  a 
restoration  or  clean  up  release. 

The  Fifth  Law  abstracts  these  observations, 
adding  an  additional  factor,  that  of  the  emergence 
of  invariant  average  incremental  growth  or  content. 
The  latter  is  also  a  consequence  of  the  additional 
exogenous  pressure  for  accelerated  functional 
growth  or  content.  The  latter  is  also  a  conse¬ 
quence  that  is  a  characteristic  feature  of  large- 
program  applications  and,  in  general,  of  organi¬ 
zational  environments. 


FINAL  WORD 

One  last  word  should  be  added.  The  first 
recognition  of  the  laws  discussed  was  based  en¬ 
tirely  on  an  examination  and  analysis  of  data 
from  some  very  large  programs.  However,  once  they 
are  formulated  the  laws  must  be  examined  in  their 
own  right  to  achieve  the  transition  from  phenomen¬ 
ology  to  science.  These  laws  of  large-program 
development  and  evolution  are  now  beginning  to  be 
understood  in  this  way.  They  are  seen  to  express 
very  basic  attributes  of  computing,  of  the  pro¬ 
gramming  development,  maintenance  and  usage  pro¬ 
cesses,  of  programs  themselves,  and  of  the  organi¬ 
zations  and  environments  in  which  these  activities 
are  carried  out. 

Once  this  ir terpretat ion  of  the  laws  in 
terms  of  more  fundamental  phenomena  has  been 
achieved  we  must  re-examine  the  old  data  and 
examine  new  information  in  the  light  of  the  laws 
(X4  uildCAAtood.  Deviations  must  be  explained  and 
interpreted.  Contradictions  may  require  re-formu¬ 
lation  and  re-interpretation  of  a  law  or  even  its 
rejection.  There  is,  of  course,  nothing  new  in 
these  comments.  They  form  the  very  basis  of  the 
scientific  method.  They  are  added  here  to  assert 
the  belief  that  the  laws  as  formulated  have  been 
substantiated  by  experience  and  by  experimental 
data  to  the  point  where  they  can  stand  in  their 
own  right  until  evidence  and  developing  insight 
and  understanding  demands  their  change. 
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■Summary 

The  paper  reports  some  preliminary  results 
from  an  attempt  to  use  real  data  to  validate  the 
software  reliability  model  proposed  by  Littlewood 
and  Verrall  £  1 j .  A  goodness-of- lit  test  is 
employed  to  compare  actual  times-to-iailure  with 
the  distributions  of  times-to-failure  predicted 
by  the  model.  The  test  shows  the  model  to  fit 
with  remarkable  fidelity.  It  is  suggested  that 
this  is  evidence  in  support  of  the  author’s 
earlier  comments  i 2~ ,  ,.i( . 


1.  Introduction 

In  recent  papers  11,  5j  I  have  criticised  some 
of  the  assumptions  underlying  much  work  or  software 
reliability  measurement,  in  particular,  1  have 
suggested  that  some  authors  13,  4,  ~  (  make 

assumptions  about  the  relationship  between  dynamic 
performance  measures  (failure  rate,  distribution  of 
future  time- to -failure,  etc.,  and  static  measures 
(number  of  remaining  bugs)  which  are  extremely 
implausible.  1  have  argued,  also,  that  some  of  the 
peculiar  properties  of  software  (c.g.  lack  of 
natural  degradation,  uniqueness  cf  each  item)  might 
result  in  much  of  the  conventional  reliability 
theory,  created  ivr  hardware  systems,  being 
inappropriate.  U  is  ny  contention  that  the  model 
developed  jointly  by  Jolin  Verrall  and  myself  avoids 
some  of  these  pitfalls  -  albeit  at  the  expense  of 
other  disadvantages.  Hie  results  reported  hero  are 
the  beginning  of  an  attempt  to  validate  this  model 
using  real-life  data.  It  is  intended  to  examine 
more  closely  in  future  work  the  points  raised  in 
C2f  and  l3J,  but  these  preliminary  results  appear 
to  lend  encouraging  support. 

2.  Description  of  the  calculations 

Our  model  is  described  in  references  i 1],  .  Si 
and  19.1,  where  details  of  t lie  parameter  estimation 
procedures  can  be  found.  Briefly,  we  assume  fin 
common  with  most  workers  in  this  field)  that  the 
time  between  (i-l)th  and  ith  failures  of  the 
program  has  (conditionally)  the  exponential  pdf: 

.'dfftjxp  =  (1) 


In  order  to  model  the  reliability  grevth  which  takes 
place  as  a  result  of  the  bug-fixing  attempts,  we 
treat  (A.)  as  a  sequence  of  random  variables  with 
pdfs:  1 

ixlffijla,  v l i ) )  -  i(i){v(i)ij>a’1exp{-v(i)A.i 

r(a) 

for  A.  >  0,  (2) 

=  o,  otherwise. 

It  is  easy  to  see  that  the  unconditional  distribut¬ 
ion  of  T.  is  not  exponential") 

lhe  unknown  quantities  in  the  model  3re  a  and 
.  t. i ) .  It  is  via  the  second  of  these  that  the  model 
reflects  past  and  future  changes  in  reliability, 

>o  this  growth  function  will  be  of  particular 
practical  importance.  We  suggest  L  11  that  a  choice 
of  paninetric  family  lie  made  for  this  function, 
then  the  estimation  problem  concems  the  parameters 
of  the  function  together  with  a. 

Our  original  hope,  when  first  working  on  the 
model,  was  that  the  growth  function  parametric 
family  might  be  known  a  priori .  ibis  now  seems 
unlikely,  so  we  propose  that  several  l'am.'ies  he 
tried  anl  that  which  host  fits  the  data  be  used. 

! t  is  fortunate  that  the  Inference  procedure 
adopted  allows  a  comparison  to  be  made  between  the 
families  (maximum  likelihood  tecliniques,  for 
example ,  do  not  enable  a  choice  to  lie  made  between 
node  Is  in  tins  way). 

when  the  parameters  of  the  model  have  been 
estimated,  distributions  of  future  times-to- 
iailure  can  be  calculated.  I rom  these  it  is 
possible  to  calculate  any  reliability  measures  of 
interest,  such  as  failure  rate,  reliability 
function.  Since  the  purpose  of  this  study  is 
validation  of  the  model,  I  have  concentrated  on  the 
percentage  points  of  the  predicted  time- to- failure 
distributions. 

Iahlc  1  lists  the  13d  observations :  those  are 
the  successive  execution  times  'in  seconds i ,  .  ich 
terminating  with  a  software  failure.  Notice  the 
great  variability:  a  feature  of  software  reliability 
data  which  must  be  reflected  hv  :iny  good  model. 

figure  1  is  an  example  of  part  of  the  output 
from  the  program  which  performs  the  calculations, 
in  tliis  case,  the  first  3S  observations  were  used 
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in  the  calculation,  and  the  growth  function  being 
used  was  linear:  *(i)  =  8.  ♦  8-,i.  there  are  thus 
three  unknown  parameters.  87,  8,,  which  are 
estimated  using  the  35  data  points:  !he  routine 
which  estimates  these  a 's  (a  is  taken  care  of  via  a 
Bayesian  analysis)  operates  by  searching  in  the  2- 
dimensional  8-space  to  find  those  values  whicli 
best  fit  tire  3S  data  points  observed  so  far.  The 
program  allows  this  search  to  be  started  at  two 
different  points:  the  two  results  should  be  quite 
close  as  a  check  on  convergence.  An  objective 
measure  of  the  quality  of  the  fit  of  the  growth 
function  is  given  by  the  value  of  the  goodness-of- 
fit  statistic,  h'STAT:  this  should  be  small.  Based 
on  estimates  of  the  parameters  utilising  the  35 
failure  times  so  far  observed,  the  program  then 
calculates  and  displays  percentage  points  of  the 
next  20  predicted  times-to- failure.  Ihus  19.t>5  is 
the  101  point,  57.11  the  251  point,  158.94  the  SOI 
point  (median),  etc.,  of  the  next  (36th)  time-to- 
lailure.  The  number  in  parentheses  gives  the 
position  on  the  predicted  distribution  of  the 
actual  observation:  it  is  these  numbers  which  are 
used  to  validate  the  model.  Thus  in  this  case,  the 
actual  36th  observation  (65  seconds)  lies  at  the 
0.276  (27.61)  point  of  the  distribution  predicted 
from  35  failure  times. 

The  calculations  have  been  performed  success¬ 
ively  upon  the  first  30,  35,  40,  ...  ,  120,  125 
observations  in  the  series.  Since  there  is  no 
a  priori  reason  for  choosing  a  particular  growth 
function,  the  calculations  have  all  been  carried 
out  on  botli 


<Ki)  =  e1  +  S2i2. 

The  measure  of  goodness -of- fit  (KSTAT  in  Fig.  1) 
is  used  to  choose  between  these  for  each  calculation. 
Table  2  shows  that  the  linear  growth  function  is 
superior  except  for  40,  45,  1C  5  obrervations. 

3,  The  results 

It  should  be  emphasised  that  the  objective  of 
these  calculations  was  to  measure  the  quality  of  the 
predicted  time-to- failure  distributions.  This  is 
a  more  stringent  test  than  procedures  such  as 
comparing  predicted  mttf's  with  average  times-to- 
failure.  In  fact,  a  secondary  objective  was  to  see 
whether  my  suspicion  of  the  non-existence  of  mttf 
for  software  could  be  tested  I 2,  31. 

It  can  easily  be  seen  that  if  the  model  is  a 
good  one,  the  numbers  in  parentheses  in  the  output 
(see  Fig.  1)  can  be  regarded  as  a  random  sample 
from  a  uniform  distribution  on  (0,  1).  This 
observation  underlies  the  results  shown  in  Figures 
2,  3,  4. 

Fig.  2  shows  the  quality  of  predictions  1  to  5 
steps  ahead  using  the  linear  growth  function.  Here 
100  actual  observations  are  compared  with  the  100 
predicted  distributions  based  on  calculations  using 
successively  30,  55,  ...  125  observations.  As  can 
be  seen  from  the  Kolmogorov- Smirnov  test ,  the  fit  is 
extraordinarily  good. 

Even  the  quadratic  growth  function,  which 


Table  2  shews  to  be  inferior  to  the  linear  one, 
gives  an  excellent  fit  (see  Fig.  3). 

In  Figure  4  a  test  is  carried  out  of  longer 
term  prediction.  Here  projections  are  made  relating 
to  the  distributions  16  to  20  failures  in  the 
future,  using  the  linear  growth  function.  The 
quality  of  these  results  is,  again,  remarkably 
good. 

It  is  particularly  noticeable,  in  Figures  2  and 
4,  that  the  fit  is  at  its  best  in  the  right  hand 
tail  of  the  distribution.  Since  it  is  precisely 
here  that  our  model  differs  from  the  simpler 
exponential  models,  by  having  long  right-hand  tails 
in  the  predicted  time-to-l'ailure  distributions 
(causing  infinite  mttf) ,  this  seems  evidence  in 
favour  of  the  controversial  allegations  of  my 
earlier  papers  .'.2,  31. 

4,  Conclusion 

Although  the  results  shown  here  relate  to  only 
one  set  of  data,  they  are  encouraging  in  the  support 
they  give  to  our  model.  In  fact  the  quality  of  fit 
between  predictions  and  observations  is  better  than 
I  expected  when  I  embarked  upon  this  validation 
exercise.  This  work  will  continue,  and  it  is  hoped 
to  analyse  more  data  shortly,  but  already  there  is 
the  beginnings  of  a  case  in  support  of  the  arguments 
given  in  references  [23  and  [33. 

the  program  which  performs  these  calculations 
is  still  being  developed,  and  it  is  hoped  to 
publish  a  version  of  it  eventually.  In  the  meantime, 
if  anyone  would  like  a  listing  of  the  current 
version,  please  let  me  know.  Alternatively,  I 
would  be  very  happy  to  have  the  opportunity  of 
analysing  any  software  failure  data  which  readers 
may  have. 
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Table  1.  The  136  successive  failure  times  upon 

which  the  calculations  are  based.  Notice 
the  great  variability  of  the  data,  and 
the  obvious  improvement  in  reliability 
as  time  passes. 


Table  2.  Comparison  of  the  two  growth  functions 
for  the  series  of  calculations.  It 
can  be  seen  that  the  linear  function 
provides  a  better  fit  except  in  the 
cases  of  calculations  based  upon  40, 
45,  and  105  observations. 
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Figure  1.  Example  of  output  from  the  program  which  performs  the  calculations 
described  in  the  text  of  the  paper.  WSTAT  is  the  value  of  the 
goodness-of-fit  statistic,  used  in  the  optimisation  routine,  at 
its  minimum:  i.e.  at  the  point  in  6-space  which  will  be  used  as 
the  estimate  in  the  predictions.  It  is  the  value  oi  this  statistic 
which  enables  a  comparison  to  be  made  between  different  growth 
functions:  the  best  growth  function  is  that  which  has  smallest 
WSTAT. 

The  percentiles  given  here  are,  from  left  to  right,  101,  25%, 

50%  (median),  75%,  90%.  The  last  column  shows  the  position  of 
each  actual  observation  on  the  distribution  which  is  predicted 
for  it  based,  here,  on  the  first  35  observations. 
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Figure  2.  lest  of  fit  between  actual  observations  and  predicted  distributions, 
using  a  linear  growth  function.  There  are  100  observations  used;  the 
prediction  is  1  to  5  steps  ahead. 


The  fit  is  extremely  good,  as  may  be  seen  from  the  K-S  value  of  .6872. 
Notice  particularly  the  closeness  of  the  points  in  the  right-hand 
portion  of  the  plot  to  the  line. 


uoliTORorov-Smimov  test : 
Dn-/n  “1.1  for  n  «  100 
P(Dn./n  >  1.1)  =  .1777 


.  -f-  ■■  »■  —  ■  »-■  ■  — »  -  —  » 

.3  .4  .:>  .6  .7  .8 


Quadratic  growth  function,  prediction  1  to  5  steps  ahead. 
Although  the  quadratic  does  not  fit  as  well  as  the  linear 
function  (see  Table  2),  the  predictions  are  again  very  good. 
The  K-S  value  is  much  smaller  than  for  the  linear  case, 
but  is  still  not  significant  at  the  101  level. 


Figure  4.  Linear  growth  function,  longer  range  prediction:  16  to  20  steps 

ahead.  This  time  only  90  observations  could  be  used  because  of  the 
length  of  the  data  vector.  Again  the  fit  is  extremely  good,  as  seen 
from  the  K-S  statistic.  Good  fit  in  right-hand  tail  of  distributions 
again. 
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PROGRESS  IN  SOFTWARE  RELIABILITY  MEASUREMENT 
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Bell  Telephone  Laboratories 
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ABSTRACT  Project  Development  Experience 


This  paper  summarizes  progress  made 
in  the  past  year  In  the  application  of 
the  execution  time  theory  [1]  of  software 
reliability.  It  also  discusses  a  con¬ 
tinuing  mutual  critical  examination  of 
the  Littlewood  model  [2]  and  the  author's 
execution  time  theory. 


Introduction 

Work  associated  with  the  execution 
time  theory  of  software  reliability  in 
the  last  year  falls  into  four  principal 
areas : 

(a)  collection  of  data  for  model 
verification, 

(b)  accumulation  of  experience  in  appli¬ 
cation  of  the  theory  to  project 
development,  with  resultant 
refinements, 

(c)  application  of  the  theory  to 
measurement  of  computation  center 
software,  involving  service  monitor¬ 
ing  and  change  control,  and 

(d)  investigation  of  execution  time 
.model  simulation  as  an  aid  to 
management  decisions. 

In  addition  to  the  foregoing  work, 
substantial  effort  was  expended  in  mutual 
critical  examination  of  the  Littlewood 
model  and  the  author's  execution  time 
theory,  resulting  in  disposition  of  some 
of  the  issues  that  had  been  raised  at  the 
first  3LCM  Workshop  and  sharpening  others. 

Data  Collection 

Failure  Interval  data  has  now  been 
collected  for  ten  software  development 
projects  and  four  operational  software 
systems.  Resource  expenditure  data  has 
been  collected  ‘’or  four  development  pro¬ 
jects.  This  data  is  being  analyzed  with 
regard  to  the  execution  time  model  of  soft¬ 
ware  reliability  and  its  assumptions  to 
determine  how  well  or  poorly  the  model  fits 
real  1 ty . 


The  execution  tine  theory  has  been 
and  is  currently  being  applied  to  various 
development  projects.  '"onsiderutle  exper¬ 
ience  lias  been  accrued  in  handling  parti¬ 
cular  circumstances  related  to  each  project 
[3].  The  accumulation  and  organization  of 
this  experience  is  continuing.  Some  of  the 
particular  problems  that  have  been 
addressed  are: 

(a)  handling  design  changes, 

(b)  specifics  of  parameter  estina* 3  on, 
and 

(c)  details  of  program  parameter 
reestimation . 

furrent  work  includes  investigation  ■  f 
methods  for  improving,  predictions  made  :’<  r 
projects  that  employ  sequential  Integration 
and  testing. 

Mea su remon_t  of  Computation 
Gent. er  Softwa re 

The  execution  time  theory  was  recently 
applied  to  operating  system  software  in  a 
large  general  purpose  computation  center 
over  a  period  of  15  months  [  A  ] .  Mean¬ 
time-  to- fa  I lu re  (MTTF)  trends  of  software 
components  were  shown  to  correlate  very 
well  with  periods  of  system  degradation 
or  improvement.  The  trends  can  he  moni¬ 
tored  as  a  basis  for  controlling  oyster, 
modification  (modifications  are  suspended 
when  the  MTTF  is  below  an  acceptable 
level),  reassigning  maintenance  personnel 
among  software  components,  and  determining 
when  to  make  cutovers  to  new  release:;  or 
regressions  to  old  releases. 

P 1  miilat. ton  as  an _ Aid_t_<> 

Management  Jfieo  is i  e-ir. 

Seme  experiment  at  ion  was  conducted 
with  simulation  of  the  exocul  ion  1  I  me  model 
as  an  aid  in  making  management  decisions 
I’aramet.ers  of  the  model  that,  were 
associated  with  different  manager  in  1 
options  were  so  looted  ,  and  r,  1-iulat  ion  was. 
used  to  determine  the  cfTccts.  on  scheduler, 
and  cost,:;. 
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For  example,  for  one  project,  the  pre¬ 
dicted  i\'B|'Ier  ion  date  w  as  eons  l<lo  rod  to 
be  too  late  and  simulations  were  run  to 
do  tern:  l  no  the  effects  of  adding  various 
resource!!  to  the  project  and  of  reducing 
the  MIT-'  objective.  For  this  particular 
project  at  t  tie  particular  point  i  ri  time 
studied,  mlding  people  or  working  overtime 
were  found  l  o  have  no  effect-.  Providing 
more  compute!'  resources  did  permit  the 


reflected  in  the  distribution  of  tine 
n-tti  failure  from  the  present. ,  whose  j  . -1  ■ 
cent  lies  can  be  estimated.  In  tin-  auth. 
model,  the  uncertainty  is  reflected  in 
confidence  intervals  estimated  f,.r  the 
subparamo’ters  .  hit  t  lewoed  '  s  sut-i  arani'-t  o 
are  estimated  in  t.herase  1  ves  but.  eonflueti 
intervals  are  riot,  determined  for  them. 
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.Ittlewood  and  Execution  Time  Models 


».  i.: 

been  erica: 
dialogue  i 
re  1  lab  t  1  Jtj 
si i mu  luted 
software  ta. 
I. 1 1  Lewood 
after  scene 
'Vi.  The  : 


tt lewoed 
;ed  in  a 
oncern  ing 
mode  1 s . 
bv  a  more 
1  lab  i  1  it.y 
[  t-  ] .  Th  : 
of  T. .he  d! 
b; 1 ow i ng 


and  the  author'  have 
ont  I  truing  critical 
■cacti  other’s  software 
This  was  initially 
general  critique  of 
models  published  by 
critique  was  updated 
cuss  ions  took  place 
s  the  author'.-,  under- 


•’s 1 1  mat iori  to  determine  the  subpar.amet: ei' 
values  and  their  confidence  intervals. 

I  it  tl  "wood's  approach  is  somewhat  more 
general,  at  a  cost  of  one  or  two  orders  ■ 
magnitude  greater  computing  requirements 
The  author  questions  whether  the  added 
generality  Is  worth  the  cost;  the  auth’r 
approach  may  well  be  sufficiently  accura' 
for  most  practical  purposes . 

The  generality  of  ;,i t  1 1  e-wood  ' a 


:! t  and i  ng  of  Uie  current  status  if  this 
d ia ! ague. 

Many  issues,  have  been  resolved  or  have 
vanished  as  a  result,  of  clearer  under. 't  'Hid¬ 
ing  t  *  his  does  not  mean  that  these  same 
issues  may  not  exist  for  other  software 
reliability  models).  Both,  models  are 
oriented  toward  failures  and  operational 
re l  lability.  We  agree  that  it  is  important 
ti  be  able  to  estimate  the  number  of  errors 
because  this  -quantity  is  related  to  the 
repair  effort  that  will  be  required  and 
lienee,  i  reject  schedules.  We  are  both  in 
agreement  that  execution  time  rather  than 
calendar  time  is  the  key  metric  related  to 
so  ft  wa  ia  -  re  1  lab  I  11  ty .  Bet  h  mode  1  s  as  sumo 
an  exponential  distribution  of  failure 
intervals . 

In  the  ! ittlewood  model,  the  failure 
rate  parameter  of  the  exponential  distribu¬ 
tion  is  assumed  to  have  a  gamma  distribu¬ 
tion,  one  of  whose  parameters  accounts  for 
reliability  growth.  The  growth  parameter 
is  some  function  (usually  in  two  subparame¬ 
ters)  of  the  sequential  failure  number. 

Iri  the  author's  model,  the  exponential 
distribution  failure  rate  parameter  is  a 
fixed  (essentially  two  subparameter)  func¬ 
tion  of  the  sequent,  lal  failure  number.  The 
two  subparameters,  initial  MTTF  and  total 
failures  expected,  have  readily  understood 
phyu  teal  interpretat ions . 

Both  models  view  the  repair  process 
as  being  statistical  In  nature.  In  the 
(.ittlewood  model,  this  uncertainty  is 
exhibited  in  the  exponential  distribution 
failure  rate  parameter  1  t.sel  f  having  a 
distribution.  This  distribution  can  be 


apt  roach  quickly  leads  to  analytical  d i f- 
ficultles;  it.  is  riot  computationally 
tract  1  cal  to  predict  the  execution  tint" 
required  to  reach  a  specified  failure 
interval  >ti,i  e- 1.  i  ve .  it  therefore  is  not 
!  osslbl"  to  predict  pr-ject  comp  let  lot. 
date,  which  is-  of  vital  Importance  to 
managers.  The  author' s  ap| roach  dees 
yield  T h e s o  predictions. 


r-tt  tlowood  and  the  author  differ-  e 
fundamental  point,  which  perhaps  is  base 
ori  the  char.'ict  or*  st  Irs  of  a  mat:  hemat.  i  ca 
statistical  versus  an  engineering  apprt 
to  the  problem  (i.e.,  focusing  on  rigor 
versus  focusing  •  n  utility).  hi 1. 1, lower 
points  out  that,  the  author's  model  is 
dependent  (due  to  the  implicit  assutr.pt  1 
that,  the  MTTF  exists)  on  the  postulate 
it  is  certain  that,  a  program  is  impend 
or  will  eventually  fail.  Based  on  ext i 
ence  as  a  software  manager  and  on  writ  1 
many  programs ,  the  author  believes  that, 
this  is  an  excellent  model  of  reality. 
The  rare  cases  when  this  postulate  may 
to  true  do  not  Justify  the  added  eomple 
and  the  added  computation  that  the 
! Ittlewood  mode]  involve. 
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Although  the  author  prefers  (lie 
execution  t  l  me  model  t  <-caur.«  h>-  !'•■( -i  s  : : 
is,  a  more  understandable,  useful,  and 
economic  engineering  and  nanageri a ]  t>, 
the  ;  'Mdewood  model's  somewhat  greater 
generality  and  rigor  are  at  t  n  t  i ve .  Vn 
sequent ly ,  studies  are  in  progress  to 
at  tempt,  to  relate  tin-  two  models  .  : ! .  j  .  — 
fully,  tile  Ml  1  lewoied  mode  I  can  tie  lesti  i 
on  data  from  projects  where  the  author's, 
mode l  was  applied.  At  t.emt  '  s  are  being 
made  by  belli  hit!  lowed  and  the  author  to 
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speed  up  the  minimization  process  the 
Littlewood  model  uses.  Perhaps  these 
studies  will  result  in  refinements  to 
both  models.  At  any  rate,  their  similar¬ 
ities  and  differences  and  advantages  and 
disadvantages  should  come  into  sharper 
focus . 
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The  Work  Breakdown  Structure  (WBS)  is  a 
vehicle  for  breaking  an  engineering  project  down 
into  subproject,  tasks,  subtasks,  work  packages, 
etc.  It  is  an  important  planning  tool  which  links 
objectives  with  resources  and  activities  in  a 
logical  framework.  It  becomes  an  important  status 
monitor  during  the  actual  implementation  as  the 
completions  of  subtasks  are  measured  against  the 
project  plan.  Whereas  the  WBS  has  been  widely 
used  in  many  other  engineering  applications,  it 
has  seemingly  only  rarely  been  formally  applied 
to  software  projects  for  various  reasons.  Recent 
successes  with  software  project  WBSs,  however, 
have  clearly  indicated  that  the  technique  can  be 
applied,  and  have  shown  the  benefits  of  such  a 
tool  in  management  of  these  projects. 

This  paper  advocates  and  summarizes  the  use 
of  the  WBS  in  software  implementation  projects. 

The  paper  also  identifies  some  of  the  problems 
people  have  had  generating  software  WBSs,  and  the 
need  for  standard  checklists  of  items  to  be 
included. 

Introduction 

If  one  were  to  be  given  the  task  o *  writing  a 
program  in  which  the  target  language  instruction 
set  was  not  entended  to  be  executed  by  some  dumb 
computer,  but,  instead,  by  intelligent  human  be¬ 
ings,  then  that  programmer  might  be  thought  to 
have  an  easier  job  than  his  colleagues  who  write 
their  programs  for  machines.  However,  a  little 
reflection  will  show  that  his  job  is  much  more  dif¬ 
ficult  for  a  number  of  reasons,  among  which  are 
ambiguities  in  the  English  language  and  a  multitude 
of  human  factors.1  However,  such  a  program,  often 
named  the  PLAN,  is  an  essential  part  of  almost 
every  Industrial  project  slated  for  success. 

One  of  the  difficulties  in  writing  this  pro¬ 
gram  is  the  supplying  of  enough  detail  so  as  to  be 
executable  without  ambiguity  by  those  programm’d. 
Another  is  getting  the  right  controls  into  the 
program  so  that  the  programees  perform  as  stat’d 
in  the  PLAN.  Still  another  is  making  the  plan 
complete,  having  all  contingencies  covered  and  a 
prtper  response  to  each  supplied.  And  one  final 
prcblem  of  note  here  is  making  the  plan  bug-free, 
or  rel i able,  so  that  once  execution  starts,  if 

*The  work  reported  in  this  paper  was  carried  out 
at  the  .let  Propulsion  Laboratory  of  the  California 
Institute  of  Technology  under  contract  NAS  7-100, 
sponsored  by  the  National  Aeronautics  and  Space 
Administration. 


everything  proceeds  according  to  the  PLAN,  there 
is  no  need  to  deviate. 

Programmers  well -schooled  in  modern  tech¬ 
niques2  would  approach  the  writing  of  this  PLAN  in 
a  structured  way,  using  top-down  design  methodol- 
ology  modular  development,  stepwise  refinement, 
hierarchic  layering  of  detail,  structurally  sound 
constructions,  and  semantically  definite  document¬ 
ation.  Such  an  approach  would  tend  to  bring  a 
measure  of  organization  to  the  PLAN,  understand- 
ability  to  its  documentation,  and  reliability  to 
its  execution.  If  created  in  this  way,  the  result¬ 
ing  format  of  the  PLAN  work  tasks  would  have  the 
attributes  of  what  is  known  in  the  engineering 
industry  as  a  "Work  Breakdown  Structure". 3 

The  Work  Breakdown  Structure  (WBS)  is  an 
enumeration  of  all  work  activities  in  hierarchic 
refinements  of  detail,  which  orgainzes  work  to  be 
done  into  short,  manageable  tasks  with  quantifiable 
inputs,  outputs,  schedules,  and  assigned  responsi¬ 
bilities.  It  may  be  used  for  project  budgeting  of 
time  and  resources  down  to  the  individual  task  level, 
and,  later,  as  a  basis  for  progress  reporting 
relative  to  meaningful  management  milestones.  A 
software  management  plan  based  on  a  WBS  contains 
the  necessary  tools  to  estimate  costs  and  sched¬ 
ules  accurately  and  to  provide  visibility  and 
control  during  production. 

Such  a  plan  may  be  structured  to  evaluate 
technical  accomplishments  on  the  basis  of  task  and 

activity  progress.  Schedules  and  PERT/CPM^  net¬ 
works  may  be  built  upon  technical  activities  in 
terms  of  task  milestones  (i.e.,  accomplishments, 
outputs,  and  other  quantifiable  work  elements). 
Projected  versus  actual  task  progress  can  be  re¬ 
viewed  by  technical  audit  and  by  progress  reviews 
on  a  regular  (say  monthly  or  bi-weekly)  basis. 

Formal  Project  Design  Reviews  are  major  check 
points  in  this  measurement  system. 

But  knowing  modern  programming  theory  does 
little  good  if  one  does  not  also  have  the  pro¬ 
gramming  experience  to  apply  it  to.  Similarly, 
the  knowledge  of  what  a  WBS  is,  what  its  goals  are, 
what  its  benefits  are,  and  what  its  structure  is 
supposed  to  be  like,  does  not  necessarily  instruct 
one  in  how  to  apply  that  knowledge  toward  develop¬ 
ing  a  WBS  for  his  or  her  own  project. 

In  the  coming  sections  of  this  paper,  I  shall 
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review  some  of  the  characteristics  and  benefits  of 
the  WBS,  and  then  discuss  how  these  can  b,_  develop¬ 
ed  and  applied  in  software  implementation  projects. 

I  will  orient  this  material  principally  toward 
new-software  production  tasks,  although  many  of  the 
concepts  will  be  applicable  also  to  continuing 
maintenance  and  operations  tasks,  as  well. 

The  Work  Breakdown  Structure 

The  goals  assumed  here  for  generating  the  WBS 
are  to  identify  work  tasks,  needed  resources,  im¬ 
plementation  constraints,  etc.  to  that  level  of  de¬ 
tail  which  yields  a  sought-for  accuracy  in  the  orig¬ 
inal  PLAN,  and  to  provide  the  means  for  early  cal¬ 
ibration  of  this  accuracy  and  corrective  replan¬ 
ning,  if  required,  during  the  actual  implement¬ 
ation. 

How  refined  should  this  WBS  be?  Let  me  answer 
this  question  by  showing  how  the  WBS  and  schedule 
projection  accuracy  are  interrelated. 

If  a  project  has  identified  a  certain  number 
of  equi-effort  milestones  to  be  achieved  during 
the  course  of  implementation,  then  the  mere  number 
of  milestones  achieved  by  a  certain  date  is  an 
indicator  of  the  progress  toward  that  goal.  A 
graph  of  accumulated  milestones  as  a  function  of 
time,  sometimes  called  a  "rate  chart",  permits 
certain  predictions  to  be  made  about  the  future 
completion  date  rather  handily  and  with  quanti¬ 
fiable  accuracy,  especially  if  the  milestones  are 
chosen  properly. 

Let  it  be  supposed  that  it  is  known  a  priori, 
as  a  result  of  generating  the  WBS,  that  a  project 
will  be  completed  after  M  milestones  have  been  met. 
These  milestones  correspond  to  all  the  tasks  which 
have  to  bp  accomplished,  and  once  accomplished,  are 
accomplished  forever  (i.e.,  some  later  activity 
does  not  re-open  an  already  completed  task;  if 
such  is  the  case,  it  can  be  accommodated  by  making 
M  larger  to  include  all  such  milestones  as  separate 
events).  The  number  M,  of  course,  may  not  be  pre¬ 
cisely  known  from  the  first,  and  any  uncertainty 
in  M  is  certainly  going  to  affect  the  estimated 
completion  date  accuracy.  Such  uncertainties  can 
be  factored  in  as  secondary  effects  later,  as 
needed  for  refinement  of  accuracy. 

Now,  let  it  be  further  supposed  that  it  has 
been  possible  to  refine  the  overall  task  into 
these  M  milestones  in  such  a  way  that  each  task  is 
believed  to  require  about  the  same  amount  of  effort 
and  duration  to  accomplish.  Viewed  at  regular 
intervals  (fc.g.,  bi-weekly  or  monthly),  a  plot  of 
the  cumulative  numbers  of  milestones  reported  as 
having  been  completed  should  rise  linearly^  until 
project  completion. 

More  quantitatively,  let  m  be  the  average 
number  of  tasks  actually  completed  during  each 
reporting  period,  and  let  o  be  the  standard  de¬ 
viation  of  the  actual  number  of  milestones  com¬ 
pleted  each  reporting  period  about  the  mean  vaiue 
(the  values  of  m  and  o  are  presumed  to  be  constant 


over  the  project  duration).  The  value  of  m  is  a. 
reflection  of  the  team  average  productivity  and 
is  a  measure  of  the  ability  estimate  their  pro¬ 
duction  rate.  Both  are  attestations  to  team  ef¬ 
fectiveness  --  first,  in  their  ability  to  produce, 
and  second,  in  the  ability  to  create  a  work  plan 
which  adequately  accounts  for  their  time. 

The  project  should  require  M/m  reporting  pe¬ 
riods  to  complete,  which  time,  of  course,  should 
not  depend  on  whether  a  WBS  was  made  or  not  (1  am 
discounting,  in  this  discussion,  whether  WBS 
generation  increases  or  decreases  productivity). 
Thus,  M/m  should  be  a  constant  value,  relatively 
speaking.  If  M  is  made  large,  tasks  are  smaller 
and  shorter,  so  proportionately  more  of  them  are 
completed  each  reporting  period.  The  project 
schedule  will,  in  fact,  assume  some  productivity, 
or  mean  accomplishment  rate,  but  an  actual  per¬ 
formance  value  will  generally  be  unknown  until 
progress  can  be  monitored  for  some  period  of  time. 

But  while  the  numbers  M  and  o  may  not  affect 
the  team  productivity,  they  do  directly  influence 
the  effectiveness  with  which  a  project  can  monitor 
its  progress  and  predict  its  future  accomplish¬ 
ments.  Generation  of  a  WBS,  of  course,  gives  (or 
estimates)  the  parameter  M.  Monitoring  the  com¬ 
pletion  of  milestones  provides  estimates  for  m  and 
o.  From  these,  projections  of  the  end  date  and 
calculations  for  the  accuracy  of  this  prediction 
can  be  made.  Based  on  such  information,  the  pro¬ 
ject  can  then  divert  or  reallocate  resources  to 
take  corrective  action,  should  progress  not  be 
deemed  suitable. 

In  this  simplified  model,  a  best  straight- 
line  fit  through  the  cumulative  milestone  progress 
over  the  first  r  reports  (of  an  expected  R=M/m 
reports)  at  regular  aT  intervals  will  predict  the 
time  required  to  reach  the  final  milestone.  It 
will  also  provide  an  estimate  of  m  and  o.  The 
normalized  predicted  completion  date  may  be  ex¬ 
pected  to  deviate  from  the  projected  value  (as  a 
one-sigma  event)  by  no  more  than-1 
cM;  1.48  .-j  (R/rM)‘* 

within  first-order  effects.  The  value  oj  =  o/ttw 
represents  the  normalized  variance  of  an  individual 
task  milestone  (it  is  limited  to  values  of  less 
than  unity  in  the  underlying  model). 

The  bound  permits  the  specification  of  WBS 
characteristics  which  enable  accurate  early  pre¬ 
dictions  of  future  progress.  High  overall  accuracy 
depends  on  a  combination  of  low  o  and  large  M. 

One  may  compensate  for  inaccurate  appraisals  of 
productivity  only  by  generating  a  very  detailed 
WBS! 

As  an  example,  suppose  that  a  101  end-date 
prediction  accuracy  is  required  by  the  end  of  the 
first  quarter  (r/R  =  .25)  of  a  project.  Then  the 
tradeoff  figure  is  M/o j2  =  876.  Hence,  if  the 
WBS  is  highly  uncertain  (o»l),  that  WBS  should 
contain  876  equi-duration  milestones!  If  the 
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project  is  confident  that  it  can  hold  more  closely 
to  its  average  productivity  (and  has  most  contin¬ 
gencies  provided  for),  with  a  aj-0.5,  then  it  needs 
only  about  220  milestones.  A  one-man-year  project 
with  bi-weekly  reporting,  one  milestone  per  report 
{26  milestones  in  all)  must  demonstrate  a  oj=0.17 
level  of  task  prediction  accuracy! 


It  is  therefore  both  necessary  and  important 
to  generate  a  detailed  WBS  rather  carefully,  and 
to  monitor  milestone  achievements  relative  to  this 
WBS  very  faithfully,  if  accuracy  in  predicting  the 
future  progress  of  a  project  is  of  great  impor¬ 
tance. 

Reaso nable  Schedule  Accuracy 

A  project  engineer  on  a  2-year,  10-man  task 
may  perhaps  be  able  to  manage  as  many  as  876  sub¬ 
tasks,  each  formally  assigned  and  reported  on. 

That  amounts  to  about  one  subtask  completion  per 
week  from  each  of  the  9  workers.  But  the  genera¬ 
tion  of  the  descriptions  for  the  876  tasks  will 
require  considerable  effort.  Moreover,  it  is  un¬ 
likely  that  such  a  detailed  plan  would  have  a  a 
also  as  large  as  one  week;  if  the  project  engineer 
has  the  ability  to  break  the  work  accurately  into 
876  week-long  subtasks,  he  or  she  can  probably 
also  estimate  the  task  deviations  well  within  a 
week. 

The  ability  of  the  project  engineer  (or  plan¬ 
ning  staff)  to  generate  a  clear  and  accurate  WBS 
will  determine  the  level  to  which  the  WBS  must  be 
taken.  Greater  accuracy  of  the  work  breakdown 
definition  produces  greater  understanding  and 
clarity  of  the  actions  necessary  to  complete  task 
objectives.  If  the  work  is  understood,  readily 
identified,  and  achievable  as  discerned,  the  con¬ 
fidence  of  reaching  the  objectives  is  high.  Thus, 
the  further  the  subtask  descriptions  become  re¬ 
fined,  the  better  the  estimator  is  able  to  assess 
the  individual  subtask  durations  and  uncertainties. 

Refinement  ceases  when  the  sought-for  M/oM^  is 
reached. 

Practically  speaking,  a  work-plan  witn  tasks 
shorter  than  one  week  in  duration  will  usually 
require  too  much  planning  and  management  overhead 
to  be  worthwhile.  On  the  other  hand,  a  work  plan 
with  tasks  longer  than  one  or  two  weeks  will  prob¬ 
ably  suffer  from  a  large  oj.  Thus,  a  breakdown 
into  one-  or  two-week  subtasks  is  probably  the 
most  reasonable  target  for  planning  purposes. 

A  work-year  consists  of  about  47  actual  weeks 
of  work  (excluding  vacation,  holidays,  sick-leave, 
etc).  Therefore,  a  project  of  w  workers  can  rea¬ 
sonably  accommodate  only  about  47w/d  tasks  per 
year  (including  management  tasks)  of  duration  d 
weeks  each;  spread  over  y  years,  the  total  number 
of  milestones  can  reach  M=47wy/d,  so  that  the 
practical  accuracy  limit  one  may  reasonably  ex¬ 
pect  at  the  one-quarter  point  in  a  project 
(r/R=.25)  is  about 


oM  s  0.432  o  j(d/wy)t2 

Note  that  accuracy  is  related  to  the  total  man- 
year  effort  in  a  project,  other  things  being  equal. 
A  3-man-year  project  completing  1  task  per  manweek 

can  expect  to  have  oM  <  0.216  oj;  with  a  o,=0.4 

(*2  days  per  weekly  task)  The  end-date  estimation 
accuracy  is  within  10%. 

Generating  The  WBS 

There  is  no  mystery  about  making  a  WBS. 

People  do  it  all  the  time,  although  they  seldom 
call  the  result  a  WBS.  Most  of  the  things  we  do, 
in  fact,  are  probably  first  organized  in  our  heads, 
and  for  small  undertakings,  most  of  the  time  that 
works  out  well.  For  more  complex  undertakings, 
especially  those  involving  other  people,  it  be¬ 
comes  necessary  to  plan,  organize,  document,  and 
review  more  formally. 

The  general  algorithm  for  generating  a  WBS  is 
even  fairly  simple  to  state.  It  goes  something 
like  this: 

1.  Start  with  the  project  statement  of  work, 
and  put  this  TASK  on  top  of  the  "working 
stack”. 

2.  Consider  the  TASK  at  the  top  of  the  work¬ 
ing  stack.  Define  technical  performance 
objectives,  end- i tern  objectives ,  reli¬ 
ability  and  quality  objectives,  schedule 
constraints,  and  other  factors,  as  ap¬ 
propriate;  inputs  and  materials  required 
for  starting  the  task;  accomplishments 
and  outputs  which  signal  the  completion  of 
the  task;  known  precedent  tasks  or  mile¬ 
stones;  known  interfacing  tasks;  and  re¬ 
sources  required,  if  known.  Determine 
whether  this  task  can  be  accomplished 
within  the  duration  (or  cost)  accuracy 
goal. 

3.  If  the  goal  is  achieved,  skip  to  the  next 
step;  otherwise,  partition  the  current 
TASK  into  a  small  number  of  comprehensive 
component  subtasks.  Include  interfacing 
tasks  and  tasks  whose  output  is  a  decision 
regarding  substructuring  of  other  sub¬ 
tasks.  Mark  the  current  TASK  as  a  "mile¬ 
stone",  pull  its  description  off  the  work¬ 
ing  stack,  push  it  onto  the  “finished 
stack",  and  push  each  of  the  subtask  des¬ 
criptions  onto  the  working  stack. 

4.  Repeat  from  step  2  until  the  working  stack 
is  empty. 

5.  Sequence  through  all  items  from  the  “fin¬ 
ished"  stack  and  accumulate  durations 
(costs)  into  the  proper  milestones. 

The  steps  in  this  algorithm  are  not  always 
simple  to  perform,  nor  can  they  always  be  done  cor¬ 
rectly  the  first  time,  nor  without  sometimes  refer- 
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ring  to  items  already  put  into  the  "finished"  list. 
The  process  is  one  of  creation,  and  thus  it  re¬ 
quires  judgement,  experience,  identification  of 
alternatives,  tradeoffs,  decisions,  and  iteration. 
For  as  the  project  statement  of  work  is  refined, 
eventually  the  implemental tion  of  the  program  it¬ 
self  appears  as  one  of  the  subtasks  to  be  refined. 
When  this  subtask  is  detailed  into  component  parts, 
the  work  descriptions  begin  to  follow  the  influ¬ 
ences  of  the  program  architecture,  organizational 
matters,  chronological  constraints,  work  locations, 
and  "whatever  makes  sense". 

Therefore,  the  formation  of  the  WBS,  the 
detailed  planning,  and  the  architectural  design 
activity  are  all  mutually  supportive.  The  archi¬ 
tecture  indicates  how  to  structure  the  tasks,  and 
the  WBS  goals  tell  when  the  architectural  phase 
of  activity  has  proceeded  far  enough.  Scheduling 
makes  use  of  the  WBS  as  a  tool  and  in  turn  influ¬ 
ences  the  WBS  generation  by  resolving  resource 
conflicts. 

But  there  are  many  subtasks  in  a  software 
project  which  are  not  connected  with  the  architec¬ 
ture  directly,  such  as  requirements  analysis,  pro¬ 
ject  administration  and  management,  and  prepara¬ 
tions  for  demonstration  and  delivery.  The  struc¬ 
ture  of  these  subtasks,  being  independent  of  the 
program  architecture,  can  be  made  fairly  standard 
within  a  given  organization  for  all  software  pro¬ 
ductions.  However,  since  there  is  no  automatic  or 
closed-loop  means  to  guarantee  whether  all  the 
planning  factors  that  need  to  be  put  into  the  WBS 
actually  get  put  into  it,  a  standard  UBS  checklist 
can  be  a  significant  boon  to  proper  software  pro¬ 
ject  planning,  to  decrease  the  likelihood  of  some¬ 
thing  "dropping  through  the  cracks". 

A  Standard  WBS  Checklist 

Previous  DSN  experience^  at  The  Jet  Propul¬ 
sion  Laboratory  with  WBS  methodology  has  permitted 
moderately  large  software  implementation  projects 
to  detect  schedule  maladies  and  to  control  project 
completions  within  about  6 %  of  originally  scheduled 
dates  and  costs.  The  WBSs  were  formed  by  indi¬ 
viduals  with  extensive  software  experience,  over¬ 
seen  by  an  expert  manager.  None  of  the  software 
individuals  had  ever  made  a  WBS  before,  and  the 
manager  had  never  tried  one  on  a  software  project. 
Together,  with  much  travail,  they  assembled  ad  hoc 
items  into  a  workable  system. 

A  candidate  standard  WBS  outline  and  checklist 
is  currently  being  assembled  and  evaluated  within 
The  Deep  Space  Network  (DSN)  at  The  Jet  Propulsion 
Laboratory.  This  Standard  WBS  checklist  includes 
many  factors  gained  from  previous  successes  and 
contains  items  to  avert  some  of  the  identified 
shortcomings.  Table  I  shows  the  upper-level  struc¬ 
ture  of  this  WBS  checklist.  Oetailed  task  descrip¬ 
tions  are  also  in  process  of  documentation  and 
evaluation.  A  short  application  guidebook  is  plan¬ 
ned,  to  instruct  cognizant  individuals  in  the  meth¬ 
od,  approach,  and  practice. 


useful  automated  WBS  entry,  update,  processing, 
and  report  generation  aids  impose  standards  on 
software  projects  that  are  intended  to  facilitate 
the  project  management  activity  and  make  it  more 
effective.  Initial  scheduling  and  downstream  re¬ 
scheduling  of  subtasks  are  aided  by  a  WBS  database 
that  contains  precedence  relationships,  durations, 
costs,  resource  requirements,  resource  availabili¬ 
ty  and  similar  constraints  on  each  subtask.  PERT 
and  critical-path  methods  (CPM)  are  applied  direct¬ 
ly  to  the  WBS  database,  resulting  in  a  preliminary 
schedule.  Alterations  of  this  schedule  are  then 
effected  by  editing  the  WBS  via  additional  con¬ 
straints  recorded  into  the  database.  Actual  pro¬ 
duction  progress  is  measured  by  marking  milestone 
completions,  which  are  then  plotted  into  a  rate 
chart  and  all  significant  milestones  are  pro¬ 
jected  to  a  best-estimate  completion  date. 


Problems 

The  Work  Breakdown  Structure  is  a  well  known, 
effective  project  engineering  tool.  It  has  not 
been  applied  to  software  projects  as  often  as  it 
has  to  hardware  and  construction,  probably  because 
the  planning  and  architectural  design  tasks  in 
software  have  not  always  been  sufficiently  inte¬ 
grated  as  to  be  mutually  supportive,  because  all 
of  the  management,  support,  and  miscellaneous 
tasks  were  seldom  fully  identifiable  and  detail- 
able  during  the  planning  phase,  because  separation 
of  work  into  manageable  packets  quite  often  re¬ 
quires  design  decisions  properly  a  part  of  the 
detailed  design  phase,  because  a  basis  for  esti¬ 
mating  subtask  durations,  costs,  and  other  con¬ 
straints  has  not  existed  or  been  known;  and  be¬ 
cause  software  managers  have  not  been  trained  in 
WBS  methodology.  Modern  software  engineering 
studies  of  phenomenology  and  methodology  are  be¬ 
ginning  to  close  the  gaps,  however. 

The  existence  of  useful  tools  and  methods 
does  not  assure  their  acceptance,  nor  does  their 
acceptance  insure  project  success.  Plans  and 
controls  are  essential  project  aids,  but  unfor¬ 
tunately,  they  also  do  not  guarantee  success. 

The  WBS  is  a  planning,  monitor,  and  control  tool 
whose  potential  for  successful  application  within 
a  software  project  has  been  demonstrated.  How¬ 
ever,  further  researches  and  demonstrations  are 
necessary  before  a  WBS-oriented  software  plan¬ 
ning  and  control  methodology  and  system  are  as 
well  integrated  into  the  software  industry  as 
structured  programming  has  only  recently  become. 
Fortunately,  many  organizations  and  individuals 
are  sensitive  enough  to  the  software  management 
crisis  of  past  years  that  inroads  are  being  worked 


Happily,  the  solutions  will  almost  certainiy 
not  be  unique,  but  will  range  over  limits  which 
accommodate  management  and  programming  styles, 
organizational  structures,  levels  of  skill,  areas 
of  expertise,  cost  and  need-date  constraints,  and 
human  and  technical  factors. 


Such  a  checklist  and  guidebook,  together  with 
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Table _ | 


SOFTWARE  IMPLEMENTATION  PROJECT 
Detailed  Work  Breakdown  Structure 
Outline 


1.  ANALYZE  SOFTWARE  REQUIREMENTS 

.1  Understand  functional  and  software  requirements 

.2  Identify  missing  vague,  ambiguous,  and  conflicting  requirements 

.3  Clarify  stated  requirements 

.4  Verify  that  stated  requirements  fulfill  requestor's  goals 
.5  Assess  technology  for  supplying  required  software 
.6  Propose  alternate  requirements  or  capability 
.  /  Document  revised  requirements 

2.  DEVELOP  SOFTWARE  ARCHITECTURE 

.1  Determine  architectural  approach 

.2  Develop  external  functional  architecture 

.3  Develop  software  internal  architecture 

.4  Assess  architected  solution  vs.  requirements 

.5  Revise  architecture  and/or  renegotiate  requirements 

.6  Document  architecture  and/or  changed  requirements 

3.  DEVELOP  EXTERNAL  FUNCTIONAL  SPECIFICATION 

.1  Define  functional  specification  standards  and  conventions 
.2  Formalize  external  er.ironment  and  interface  specifications 

.3  Refine,  formalize,  and  document  the  architected  external  operational  view  of  t'.e  software 
.4  Define  functional  acceptance  tests 

.5  Verify  compliance  of  the  external  view  with  requirements 

4.  PRODUCE  AND  DELIVER  SOFTWARE  ITEMS 

.1  Define  programming,  test  and  verification,  QA,  and  documentation  standards  and  conventions 
.2  Formalize  internal  environment  and  interface  specifications 
.3  Obtain  support  tools 

.4  Refine  and  formalize  the  internal  design 

.5  Define  testing  specifications  to  demonstrate  required  performance 
.6  Define  QA  specifications 
.7  Code  and  check  the  program 
.8  Demonstrate  acceptabi I i ty  and  deliver  software 

5.  PREPARE  FOR  SOFTWARE  SUSTAINING  AND  OPERATIONS 

.1  Train  cognizant  sustaining  and  maintenance  personnel 

.2  Train  cognizant  operations  personnel 

.3  Deliver  sustaining  tools  aria  materials 

.4  Deliver  all  software  and  data  deliverables  tc  operations 

.5  Install  the  software  and  data  into  its  operational  environment 

.6  Prepare  consulting  agreement  between  implementation  and  operations 

6.  PERFORM  PROJECT  MANAGEMENT  FUNCTIONS 

.1  Define  project  goals  and  objectives 
.2  Scope  and  plan  the  project 
.3  Administrate  the  implementation 
.4  Evaluate  performance  at  roduct 
.5  Terminate  the  project 
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Abstract 

The  paper  discusses  the  current  status  of  the 
Software  Engineering  Laboratory.  Data  is  being 
collected  and  processed  during  the  development  of 
several  NASA/Goddard  Space  Flight  Center  ground 
support  projects.  The  data  is  used  to  evaluate 
software  development  disciplines  and  various 
models  and  measures  of  the  software  development 
process.  Emphasis  is  placed  upon  models  of  re¬ 
source  estimation,  the  analysis  of  error  and 
change  data,  and  program  complexity  measures. 


The  Software  Engineering  Laboratory  is  a  research 
project  between  NASA/Goddard  Space  Flight  Center 
and  the  Department  of  Computer  Science  of  the 
University  of  Maryland.  Ground  support  software, 
in  the  six  to  twelve  man-year  range,  developed  for 
the  Systems  Development  Section  of  NASA,  is  studied 
in  detail  for  determining  the  dynamics  of  software 
development  and  the  effects  of  various  features 
and  methodologies  on  this  development  [Basil!  and 
Zelkowitz  77].  Most  data  is  collected  in  a  set  of 
reporting  forms  that  are  either  filled  out  period¬ 
ically  by  all  project  personnel  (e.g.,  a  weekly 
Component  Status  report)  or  whenever  certain  events 
occur  (e.g.,  a  Change  Report  Form  when  an  error  is 
corrected).  This  report  describes  the  activities 
of  the  laboratory  for  the  last  twelve  months. 

The  initial  goal  of  the  Software  Engineering  Lab¬ 
oratory  was  the  collection  of  valid  data  and  the 
entering  of  this  data  into  a  computerized  data 
base.  During  the  last  twelve  months,  this  process 
has  been  implemented  and  the  analysis  of  the  data 
has  begun.  This  report  will  be  divided  into  four 
sections  briefly  outlining  each  of  the  major  ac¬ 
tivities  undertaken  by  the  laboratory:  (1)  Data 
Collection  Activities,  (2)  Resource  Estimation, 

(3)  Error  Analysis,  and  (4)  Program  Complexity. 

Data  Collection  Activities 


The  first  task  of  the  laboratory  was  to  implement 
a  data  base  that  accurately  reflected  software  de¬ 
velopment.  The  INGRES  data  base  system  operating 
under  the  UNIX  operating  system  on  a  PDP  11/45 


♦Research  supported  in  part  by  grant  NSG-5123  from 
NASA/Goddard  Space  Flight  Center  to  the 
University  of  Maryland. 


computer  at  the  University  of  Maryland  was  chosen 
as  the  basic  data  base  system  [Stonebraker  76]. 

This  activity  resulted  in  the  following  steps: 

A  generalized  table-driven  program  was  implemented 
that  converted  the  raw  typed-in  forms  to  a  format 
acceptable  to  INGRES.  However,  it  soon  became 
apparent  that  the  major  problems  were  not  program 
oriented,  but  were  in  the  human  communication 
necessary  to  carry  out  this  activity. 

Forms  were  frequently  filled  out  containing  names 
not  yet  recognized  by  the  data  base.  Other  fields 
were  sometimes  missing  or  unclear.  Constant  inter¬ 
action  between  the  University  personnel  and  the 
programmers  filling  out  the  forms  became  necessary 
in  order  to  solve  this  problem. 

Thus,  the  first  change  in  procedure  was  to  rewrite 
the  data  validation  program  for  the  PDP  11/70  at 
NASA.  Forms  are  turned  in  to  a  single  individual 
assigned  to  the  Laboratory,  The  form  is  scanned 
manually  and  any  errors  are  brought  to  the  atten¬ 
tion  of  the  programmers.  The  validation  program 
finds  additional  errors  that  can  be  quickly  cor¬ 
rected.  Correct  forms  are  written  to  tape  for 
transmittal  to  the  University. 

This  activity  led  to  a  second  task — a  revision  of 
the  forms.  We  observed  that  the  programmers  pre¬ 
ferred  a  ’’checklist"  format  rather  than  a  set  of 
"fill  in  the  blanks,”  even  if  more  checks  were 
needed  than  blanks.  Many  of  the  early  forms  were 
studied  for  typical  responses  and  the  forms  were 
modified  appropriately.  In  addition,  some  seem¬ 
ingly  useful  information,  but  based  upon  data  that 
was  generally  not  being  given  by  the  programmers, 
has  been  deleted  in  order  to  lessen  the  apparent 
overhead  perceived  by  the  programmers  partici¬ 
pating  in  the  laboratory. 

Another  activity  now  under  way  is  the  movement  of 
the  data  base  to  the  PDP  11//0  at  NASA.  Due  to 
the  smaller  size  of  the  PDP  11/45  at  the  Univer¬ 
sity  and  the  relative  inefficiency  of  INGRES  for 
large-scale  applications,  operation  of  the  Uni¬ 
versity  setup  is  starting  to  become  cumbersome. 

The  PDP  11/70  should  eliminate  that  problem. 

Summarizing  the  activities  of  the  past  year,  sev¬ 
eral  schemes  were  developed  and  we  now  have 
evolved  a  9emi -automatic  process  for  entering 
data  into  a  data  base: 
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1.  Forms  are  turned  in  and  manually  scanned  for 
errors. 

2.  The  forms  are  entered  into  a  validation  pro¬ 
gram  at  NASA.  If  errors  are  present,  the  form  is 
returned  for  corrections.  If  correct,  it  is 
written  to  tape. 

3.  The  tape  of  correct  forms  is  brought  to  the 
University  for  data  base  entry. 

(By  January  1979,  it  is  expected  that  the  corrected 
tape  will  also  be  entered  into  a  data  base  on 
NASA's  PDP  11.  At  that  time  the  decision  will  be 
made  as  to  whether  to  keep  the  University  data 
base  or  to  interface  with  NASA's.) 

Resource  Estimation 


One  early  research  activity  was  the  investigation 
of  resource  utilization.  The  Rayleigh  curve  has 
been  studied  for  larger  projects  and  the  applica¬ 
bility  of  this  theory  in  the  smaller  NASA  environ¬ 
ment  was  investigated. 


Cumulative  costs  for  large-scale  software  develop¬ 
ment  has  been  shown  to  approximate  the  curve 
—a  t  ^ 

K  (1  -  e  )  where  K  is  the  total  project 

cost  and  t  is  the  elapsed  time  since  project  in¬ 
itialization  [Putnam  76].  This  is  usually  repre¬ 
sented  in  its  differential  form  called  a  Rayleigh 

curve:  (2  K  a  t  e  “a  C  ^) ,  and  represents  the 

rate  of  consuming  resources.  This  curve  looks 
somewhat  like  a  normal  distribution  with  a  more  ex¬ 
tended  tail  (see  Figure  2). 


In  our  NASA  environment,  from  the  general  project 
summary  form,  these  numbers  are  obtained: 


this  curve  differ  from  actual  resource  consumption. 
The  conclusion  seems  to  be  that  the  Rayleigh  curve 
is  only  a  crude  approximation  to  reasonable  con¬ 
sumption.  (See  [Basili  and  Zelkowitz  78a]  for  more 
details . ) 

In  order  to  test  this  further,  several  other 
curves  were  correlated  with  the  actual  data  (para¬ 
bola,  trapezoid  and  straight  line)  [Mapp  78].  All 
had  as  good  correlations  to  the  data  as  the  Ray¬ 
leigh  curve.  Thus,  the  Rayleigh  curve  was  no 
better,  and  in  many  cases  worse,  than  other 
estimates. 

In  addition,  Norden's  original  assumptions  involve 
a  linear  growth  in  the  rate  of  understanding  a 
project  [Norden  70].  In  reality,  this  learning 
curve  slows  as  personnel  become  familiar  with  a 
project.  Based  upon  this  assumption,  [Parr  78] 
has  developed  a  curve  based  upon  the  hyperbolic 
secant  that  may  be  more  applicable  in  the  NASA 
environment.  This  and  other  theories  related  to 
the  Rayleigh  curve  are  now  being  studied. 

The  evaluation  performed  in  [Basili  and  Zelkowitz 
78a]  has  led  to  a  set  of  procedures  that  can  be 
used  to  monitor  project  development  in  a  produc¬ 
tion  environment.  While  the  full  set  of  seven 
reporting  forms  may  prove  to  be  too  much  over¬ 
head,  a  set  of  procedures  using  only  three  forms 
can  be  used  to  monitor  project  progress  with 
reasonable  accuracy  [Basili  and  Zelkowitz  78b]: 
the  General  Project  Summary,  submitted  at  each 
project  milestone;  the  Resource  Summary,  giving 
hours  worked  by  all  project  personnel  by  week;  and 
the  Change  Report  Form  giving  all  changes  to  the 
system. 


1.  Ke,  total  estimated  cost  of  the  project  in 
hours  of  effort.  Counting  overhead  items,  like 
typing  support  and  librarians,  total  costs  (K)  are 
usually  112%  of  Ke. 

2.  Y  the  maximal  effort  per  week.  From  this, 

d ,  2 

constant  a  can  be  developed,  a  =  (l/2y^  ). 

3.  Ta  the  estimated  date  of  acceptance  testing. 
Iii  NASA's  environment  this  usually  occurs  after 
88%  of  total  expenses  are  consumed. 

Since  the  Rayleigh  curve  has  two  parameters  (K  and 
a)  and  the  general  project  summary  gives  three 
(ie,  yd  and  t  ),  the  applicability  of  the  Ray¬ 
leigh  cfirve  to*  this  environment  can  be  checked  by 
using  two  of  these  estimates  to  predict  the  third. 

Figure  L  'epresents  this  analysis  for  two  projects. 
Figure  l.A  presents  the  estimated  data  from  the 
general  project  summary.  In  Figure  l.B,  t  was 
estimated  from  K  and  y^  and  y^  was  estimated 
fiom  K  and  t  .  eFinallyi  Figure  l.C  presents  the 
actual  data. 

Figure  2  plots  some  of  this  for  these  two  projects. 
While  Figure  1  shows  that  and  y^  are  accurate 
predictors  of  t  (e.g.,  an  estimate  of  60  weeks 
for  project  A,  only  a  two-week  error  from  the 
actual  62  weeks,  and  a  much  better  estimate  than 
the  InitLal  estimate  of  46  weeks),  the  plots  of 


Error  Analysis 

The  principal  motivations  for  studying  errors  and 
changes  have  been  to  discover  the  effects  of 
various  factors  on  the  number  and  kinds  of  errors 
made  in  system  developments,  and  to  find  ways  to 
evaluate  proposed  software  development  methodolo¬ 
gies. 

To  study  this,  a  number  of  tasks  have  been  per¬ 
formed.  First,  to  assure  that  all  of  the  forms 
have  been  filled  out  in  a  consistent  manner,  a 
glossary  of  terms  has  been  defined  and  made  avail¬ 
able  to  the  participants  of  the  monitored  software 
development  process.  Second,  a  set  of  questions 
of  interest  were  defined  which  were  used  to  tno- 
ti  te  both  the  form  content  and  organize  the 
kinds  of  data  required  in  the  form  of  interviews 
with  the  participants.  Questions  of  interest 
include  the  following: 

What  are  good  ways  of  characterizing  error- 
proneness  of  software  development?  Measures  such 
as  the  total  number  of  errors,  errors  per  line  of 
code,  errors  per  man  hour,  errors  per  component 
type  where  type  refers  to  the  kind  of  sub-applica¬ 
tion  or  level  of  complexity,  number  of  fixes  per 
project  phase  per  component  are  being  considered. 
We  are  also  looking  at  relationships  h- tween  tin- 
various  types  of  error  classification. 
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What  are  the  major  sources  of  errors?  One  possible 
characterization  is  by  analyzing  whether  errors 
are  traceable  back  to  requirements,  specification, 
interface  definition  or  intra-coraponent  design,  or 
clerical  activities  or  the  hardware  environment. 

What  are  appropriate  ways  of  measuring  ease  of 
software  change?  Data  is  being  collected  on  effort 
per  change  in  terms  of  time,  the  number  of  fixes 
required  for  the  change,  and  the  number  of 
errors  generated  by  the  change. 

What  is  the  effect  of  continual  change  on  a  soft¬ 
ware  product?  Data  is  gathered  on  the  cost  of 
change  as  a  function  of  time  and  cumulative 
changes. 

What  type  of  changes  cause  most  of  the  errors? 

This  may  be  very  environment  dependent  or  it  may 
give  some  insights  into  improved  organizations 
and  methodologies  for  software  development. 

What  is  the  effect  of  personnel  organization  on 
errors?  Data  is  being  collected  on  correctness 
as  measured  by  errors  per  number  of  people  work¬ 
ing  on  a  piece  of  software.  Again,  this  should 
shed  some  insights  on  the  way  to  organize  tasks 
within  a  given  environment. 

What  types  of  changes  predominate  during  software 
development?  Knowing  this  should  aid  in  design¬ 
ing  software  to  anticipate  the  possible  changes. 

T./hat  are  the  most  prevalent  error  detection  and 
correction  techniques?  Knowing  what  is  used 
most  often  and  what  works  and  at  what  cost  will 
help  in  determining  what  should  be  used  for  what 
classes  of  errors  in  what  environment. 

What  is  the  effect  of  various  constraints,  such  as 
time  and  memory  on  error  distributions?  Under¬ 
standing  this  will  permit  better  evaluation  of  the 
tradeoffs  in  software  management. 

These  are  but  some  of  the  types  of  questions  the 
error  analysis  phases  of  the  Software  Engineering 
Laboratory  is  studying.  The  data  for  most  of 
these  questions  is  gathered  from  the  Change  Report 
Form,  with  additional  information  from  the  other 
forms  and  follow-up  interviews  to  validate  the 
accuracy  of  the  information  and  gather  additional 
data  not  easily  collected  in  a  form  format. 

Based  upon  the  above  questions,  several  "first 
order  metrics"  have  been  defined  and  software  has 
been  developed  to  gather  information  from  the 
data  base.  Data  is  being  gathered  at  a  slow  pace 
partly  because  of  the  current  backlog  of  Change 
Report  Forms  which  have  not  yet  been  entered  into 
the  data  base,  and  partly  because  of  the  refine¬ 
ment  of  the  form  as  mentioned  in  the  section  on 
Data  Collection  Activities.  Early  analyses  on  a 
couple  of  projects,  however,  do  indicate  that  the 
distribution  of  errors  during  development  appears 
to  approximate  the  Rayleigh  curve  as  found  by 
[Schick  and  Wolverton  78]. 

Continued  effort  will  deal  with  the  gathering  of 


information  to  answer  the  basic  questions  of  inter¬ 
est,  further  development  of  new  questions  of  in¬ 
terest,  and  possible  "second  order  metrics"  based 
on  the  intuition  gathered  from  the  current  studies. 

Program  Complexity 

There  is  much  interest  in  measures  of  complexity 
of  the  software  product,  the  valid  aspects  of  the 
product  that  effect  human  understanding.  There  is 
an  interest  in  quantitatively  measuring  these 
aspects  so  that  characteristics  of  programs  that 
make  them  more  or  less  error  prone,  harder  to 
modify,  or  more  difficult  to  develop  can  be  better 
understood  and  recognized.  Measures  proposed  in 
the  literature  may  even  be  used  to  characterize 
differences  in  the  development  process. 

Work  has  been  done  at  the  University  of  Maryland 
to  analyze  and  compare  the  development  of  soft¬ 
ware  in  an  experimental  environment  to  determine 
the  effects  of  development  methodologies  [Basil! 
and  Reiter  78].  The  experiment  involved  the  use 
of  three  different  types  of  development:  Single 
individuals  using  ad  hoc  techniques,  groups  of 
three  using  ad  hoc  techniques,  and  groups  of  three 
using  a  structured  programming  methodology.  Re¬ 
sults  have  shown  that  there  is  some  distinction 
in  the  product  using  very  rough  measures  of  the 
program  characteristics,  such  as  number  of  if 
statements,  number  of  globals,  etc.  Based  on  this 
study,  the  organized  group  lies  somewhere  between 
the  ad  hoc  group  and  the  single  individual.  How¬ 
ever,  with  regard  to  process  measures,  the 
organized  group  has  shown  less  computer  runs  in 
all  phases  of  development  and  less  errors  (using 
a  measure  of  errors  called  program  changes  which 
is  algorithmically  computable  based  on  different 
versions  of  the  software  product  [Dunsmore  78 J ) . 

It  is  planned  to  implement  the  promising  measures 
from  this  research  on  programs  from  the  NASA  en¬ 
vironment.  Versions  of  the  systems  developed  at 
NASA  have  been  saved  and  will  be  compared  for 
program  changes  and  checked  against  the  result 
from  the  Error  Report  in  the  Change  Report  Forms. 
Further  work  is  being  done  in  automating  and  com¬ 
paring  various  complexity  measures.  These  Include 
several  of  our  own  measures  (prime  program 
hierarchy,  data  bindings,  etc.)  as  well  as  some 
of  the  measures  that  have  appeared  in  the  litera¬ 
ture  [Halstead  77;  McCabe  76] . 
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Initial  Estimates  from 

General  Project  Summary 

PROJECT 

A 

PROJECT 

B 

Ka,  Resources  needed  (hrs) 

Ta,  Time  to  completion  (wks) 
Yd,  Maximum  resources/wk  (hrs) 

14,213 

46 

350 

12,997 

41 

320 

Completion  Estimates  Using 
Rayleigh  Curve 

K,  Resources  needed  (hrs) 

Est.  Yd  with  Ta  fixed  (hrs) 
Est.  Ta  with  Yd  fixed  (hrs) 

16,151 

440 

58 

14,770 

456 

58 

Actual  Project  Data 

K,  Resources  needed  (hrs) 

Yd,  Maximum  resources  (hrs) 

Ta,  Completion  time  (wks) 

17,741 

371 

62 

16,543 

462 

54 

Ta,  Estimated  using  actual 
values  of  K  and  Yd  (wks) 

60 

43 

Figure  1:  Estimating  Ta  and  Yd  from  General 
Project  Summary  Data 


★  -  Estimating  curve  with  Yd  (maximum  resources)  fixed 
+  -  Estimating  curve  with  Ta  (completion  date)  fixed 

•  -  Actual  data 


Figure  2,  Estimated  resource  expenditures  curve 
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ABSTRACT 

Three  software  complexity  metrics  (number  of 
statements,  McCabe's  v(G) ,  and  Halstead's  E)  were 
compared  to  performance  on  two  software  maintenance 
tasks.  In  an  experiment  on  understanding,  length 
and  v(G)  correlated  with  the  percent  of  statements 
correctly  recalled.  In  an  experiment  on  modifica¬ 
tion,  most  significant  correlations  were  obtained 
with  metrics  computed  on  modified  rather  than  un¬ 
modified  code.  All  three  metrics  correlated  with 
time  to  complete  the  modification,  while  only 
length  and  v(G)  correlated  with  the  accuracy  of  the 
modification.  Relationships  in  both  experiments 
occurred  in  unstructured  rather  than  structured 
code,  and  ii.  the  second  experiment  where  no  com¬ 
ments  appeared  in  the  code.  The  metrics  were  also 
most  predictive  of  performance  for  inexperienced 
programmers.  Thus,  these  metrics  appeared  to  assess 
psychological  complexity  only  where  programming 
practices  did  not  provide  assistance  in  understand¬ 
ing  the  code.  In  both  experiments  all  three  metrics 
were  highly  intercorrelated . 


INTRODUCTION 

In  1972,  Halstead  first  published  his  theory 
of  software  physics  (renamed  software  science) 
stating  that  algorithms  have  measurable  character¬ 
istics  analogous  to  physical  laws.  According  to 
Halstead  (1972a,  1972b,  1975,  1977)  the  amount  of 
effort  required  to  generate  a  program  can  be  cal¬ 
culated  from  simple  counts  of  distinct  operators 
and  operands  and  the  total  frequencies  of  operators 
and  operands.  From  these  four  quantities  Halstead 
». al rul.-ites  the  number  of  mental  comparisons  re¬ 
quired  to  generate  a  program.  Halstead's  metrics 
attempt  to  represent  the  psychological  complexity 
of  software.  Correlations  often  greater  than  .90 
(Fitzsimmons  &  Love,  1978)  have  been  reported  be¬ 
tween  Halstead's  metrics  and  such  measures  of  pro¬ 
grammer  performance  as  the  number  of  bugs  in  a 
program  (Cornell  &  Halstead,  1976;  Fitzsimmons, 
1978;  Funami  &  Halstead,  1975),  programming  time 
(Gordon  &  Halstead,  1973;  Halstead,  1976),  and  the 
quality  of  programs  (Bulut  &  Halstead,  1974;  Els- 
hoff,  1976;  Halstead,  1973). 

More  recently  McCabe  (1976)  developed  a  defin¬ 
ition  of  complexity  based  on  the  decision  structure 
of  a  program.  McCabe's  complexity  metric  is  the 


classical  graph  theory  cyclomat ic  number  which  rep¬ 
resents  the  number  of  regions  in  a  graph,  or  in  the 
current  usage,  the  number  of  linearly  independent 
control  paths  comprising  a  program.  Simply  stated, 
McCabe's  metric  counts  the  number  of  basic  control 
paths.  When  combined  these  paths  will  generate 
every  possible  path  through  the  program.  Since  Mc¬ 
Cabe  attempted  to  relate  his  metric  to  the  diffi¬ 
culty  of  testing  a  program,  it  was  presented  as  a 
measure  of  computational  complexity.  Nevertheless, 
the  number  of  basic  control  paths  indexed  by  Mc¬ 
Cabe's  metric  may  also  be  an  important  aspect  of 
psychological  complexity,  since  additional  control 
paths  could  make  a  program  more  difficult  to  under¬ 
stand. 

Since  the  reasons  for  assessing  them  are  com¬ 
pletely  different,  psychological  and  computational 
complexity  should  be  clearly  distinguished  when  in¬ 
terpreting  software  complexity  metrics.  Computa¬ 
tional  complexity  refers  to  characteristics  of  al¬ 
gorithms  or  programs  which  make  their  proof  of  cor¬ 
rectness  difficult,  lengthy,  or  impossible  (Rabin, 
1977).  For  example,  as  the  number  of  distinct  con¬ 
trol  paths  through  a  program  increases,  the  compu¬ 
tational  complexity  also  increases.  Psychological 
complexity  refers  to  characteristics  of  software 
which  make  it  difficult  to  understand  and  work 
with.  Thus,  computational  complexity  assesses  the 
difficulty  of  verifying  an  algorithm's  correctness, 
while  psychological  complexity  assesses  human  per¬ 
formance  on  programming  tasks.  No  simple  relation¬ 
ship  between  computational  and  psychological  com¬ 
plexity  is  expected.  For  example,  a  program  with 
many  control  paths  may  not  be  psychologically  com¬ 
plex,  since  any  regularity  to  the  program's  branch¬ 
ing  process  may  simplify  its  understanding. 

The  research  reported  here  was  designed  to  in¬ 
vestigate  factors  influencing  two  tasks  in  software 
maintenance!  understanding  an  existing  program  and 
implementing  modifications  to  it.  These  factors 
included  structured  programming  techniques,  cogni¬ 
tive  programming  aids,  and  program  complexity. 

While  the  first  two  factors  were  manipulated  exper¬ 
imentally,  no  systematic  attempt  was  made  to  manip¬ 
ulate  program  complexity.  This  paper  reports  data 
from  two  experiments  which  investigated  how  stveral 
complexity  metrics  were  related  to  the  understand¬ 
ing  (Experiment  1)  and  modifying  (Experiment  2)  of 
computer  programs. 


Method 


EXPERIMENT  1 


characters,  while  the  most  mnemonic  names  were  cho¬ 
sen  from  names  suggested  by  a  group  of  programmers 
not  participating  in  the  study. 
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Participants.  Thirty-six  programmers  were 
tested  in  five  different  General  Electric  loca¬ 
tions.  The  participants  had  working  knowledge  of 
FORTRAN  and  averaged  6.8  years  of  professional  pro¬ 
gramming  experience  (SD  °  5.8).  Most  participants 
came  from  an  engineering  background,  while  several 
were  experienced  in  statistical  or  non-numerical 
software. 


Experimental  design.  In  order  to  control  for 
individual  differences  in  performance,  a  within- 
subject,  34  fractional  factorial  design  was  employ¬ 
ed  (Hahn  &  Shapiro,  1966;  Kirk,  1968).  Three  types 
of  control  flow  were  defined  for  each  of  nine  pro¬ 
grams,  and  each  version  was  presented  in  three 
levels  of  variable  mnemonicity  for  a  total  of  81 
experimental  programs. 


Procedure.  Each  participant  was  presented  a 
packet  of  materials  with  written  instructions  on 
the  experimental  tasks.  As  a  preliminary  exercise, 
all  participants  were  presented  the  same  short  roL- 
TRAN  program  and  a  brief  description  of  its  pur¬ 
pose.  They  studied  this  program  for  10  minutes  and 
were  then  given  5  minutes  to  reconstruct  a  func¬ 
tional  equivalent  from  memory.  This  introductory 
program  diminished  learning  effects  prior  to  the 
experimental  phase. 

Following  the  initial  exercise,  participants 
were  presented  sequentially  with  three  separate 
programs  comprising  their  experimental  tasks.  They 
were  allowed  25  minutes  to  study  each  program  and 
could  make  notes  or  draw  flowcharts.  At  the  end  of 
the  study  period,  the  original  program  and  all 
scrap  paper  were  collected.  Each  participant  was 
then  given  20  minutes  to  reconstruct  a  functionally 
equivalent  program  from  memory  on  a  blank  sheet  of 
paper,  but  was  not  required  to  reproduce  the  com¬ 
ment  section.  A  break  of  15  minutes  occurred  be¬ 
fore  the  last  program  was  presented. 


Halsteadfs  E.  Halstead’s  effort  metric  (E) 
was  computed  precisely  from  i  program  (based  on  Ot- 
tenstein,  1976)  whose  input  was  the  source  code 
listings  of  the  27  programs  representing  nine  dis¬ 
tinct  programs  at  each  of  three  levels  of  struc¬ 
ture.  The  computational  formula  was: 


E 

where, 
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=  number  of  unique  operators 
=  number  of  unique  operands 
=  total  frequency  of  operators 
*  total  frequency  of  operands 


McCabe’s  v(G) .  McCabe’s  metric  is  the  classi¬ 
cal  graph-theory  cyclomat ic  numbei  defined  as: 


v(G)  *  //  edges  -  //  nodes  +  2  (#  connected 

components) . 


Programs.  Three  general  classes  of  programs 
were  used:  engineering,  statistical,  and  non-numer¬ 
ical.  Three  programs  were  employed  from  each  class 
with  lengths  varying  from  36  to  57  statements. 

These  nine  programs  were  selected  from  among  many 
solicited  from  programmers  at  several  locations  and 
were  considered  representative  of  programs  actually 
encountered  by  practicing  programmers.  All  experi¬ 
mental  programs  were  executed  using  appropriate 
test  data. 

Complexity  of  control  flow.  Three  control 
flow  structures  were  defined  for  each  program. 
Structured  control  flow  adhered  very  strictly  to 
the  tenets  of  structured  programming  (Dijkstra, 
1972).  When  the  rules  for  structured  programming 
are  applied  rigorously, awkward  constructions  may 
occur  in  standard  FORTRAN,  such  as  DO  loops  with 
dummy  indices  (Tenny,  1974).  In  a  second  version 
these  awkward  construct  ions  were  largely  eliminated 
with  a  more  naturally  structured  control  flow. 

These  conventions  included  multiple  returns,  back¬ 
ward  exits  from  DO  loops,  and  judiciously  used 
backward  CO  TO’s.  In  the  unstructured  version  of 
each  program  the  control  flow  was  not  straightfor¬ 
ward.  Expanded  DO  loops,  arithmetic  IF's,  and  un¬ 
restricted  use  of  CO  TO’s  were  allowed. 

Variable  name  mnemonicity.  Three  levels  of 
mnemonicity  for  variable  names  were  manipulated  in¬ 
dependently  of  program  structure.  The  least  mne¬ 
monic  names  consisted  of  one  or  two  alphanumeric 


McCabe  presents  two  simpler  methods  of  calculating 
v(G) .  For  structured  programs  v(G)  equals  the  num¬ 
ber  of  predicate  nodes  plus  1.  Values  of  v(C)  can 
also  be  computed  from  a  planar  graph  of  the  control 
flow  by  counting  the  number  of  regions. 

Length.  The  length  of  the  program  was  comput¬ 
ed  as  the  total  number  of  FORTRAN  statements  ex¬ 
cluding  comments. 

Dependent  variable.  The  criterion  for  scoring 
programs  was  the  functional  correctness  of  each 
separately  reconstructed  statement.  Variable  names 
and  statement  numbers  which  differed  from  those  in 
the  original  program  were  counted  as  correct  when 
used  consistently.  Control  structures  could  be 
different  from  the  original  program  so  long  as  the 
statement  performed  the  same  function.  The  score 
on  each  experimental  task  was  the  percent  of  state¬ 
ments  correctly  recalled. 

Results 

Experimental  manipulations.  A  complete  report 
of  the  experimental  results  is  presented  in  Shep¬ 
pard,  Borst,  Curtis,  and  Love  (in  press).  Briefly, 
a  mean  of  50%  of  the  statements  were  correctly  re¬ 
called  across  all  programs  and  experimental  condi¬ 
tions.  Substantial  differences  in  performance 
were  observed  across  the  nine  programs.  Perfor¬ 
mance  on  naturally  structured  programs  was  superior 
to  that  on  unstructured  programs.  Dlfferemes  In 
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the  mnemonic ity  of  the  variable  names  did  not  af¬ 
fect  performance. 

Software  complexity  metrics.  Since  different 
levels  of  variable  mnemonicity  neither  affected 
performance,  nor  caused  any  change  in  the  value  of 
the  complexity  metrics  for  a  particular  program, 
the  data  reported  in  this  section  were  aggregated 
over  the  three  levels  of  variable  mnemonicity  on 
each  of  the  nine  programs  at  each  of  the  three 
levels  of  structure.  Thus,  each  of  the  27  data 
points  represented  the  average  of  at  least  three 
performance  scores. 

Halstead's  E  and  McCabe's  v(G)  were  highly 
correlated  (r  =  .84,  ££  .001),  while  length  dis¬ 
played  only  moderate  relationships  with  these  two 
metrics.  The  correlations  between  performance  and 
the  complexity  metrics  were  all  negative,  indicat¬ 
ing  that  fewer  lines  were  recalled  as  the  level  of 
complexity  represented  by  these  three  metrics  in¬ 
creased.  Performance  was  moderately  related  to 
length  0"  =  -.53,  £  £  .01)  and  McCabe's  v(G)  (_r  = 
-.35,  ££  .05),  but  not  to  Halstead's  E. 

The  complexity  of  the  control  flow  moderated 
the  relationship  between  performance  and  the  com¬ 
plexity  metrics.  That  is,  while  insignificant  cor¬ 
relations  were  observed  when  the  control  flow  was 
structured  or  naturally  structured,  this  was  not 
the  case  for  unstructured  code.  Correlations  with 
performance  of  ,55  (£  £  .001)  and  .45  (£  £  .01)  for 
v(G)  and  E  were  observed  on  unstructured  pro¬ 
grams. 

A  similar  moderating  effect  was  observed  for 
a  programmer's  extent  of  professional  experience. 
For  programmers  with  three  or  less  years  of  pro¬ 
fessional  experience,  correlations  of  -.47  (£  £ 
.001)  for  McCabe's  v(C.)  and  -.35  (£  £  .05)  for  Hal¬ 
stead's  E  were  observed.  Insignificant  correla¬ 
tions  were  observed  for  programmers  with  more  than 
three  years  experience. 

EXPERIMENT  2 

Method 

Participants.  As  in  the  previous  experiment, 
the  sample  for  this  experiment  consisted  of  36  pro¬ 
fessional  programmers  from  three  General  Electric 
locations.  The  participants  averaged  5.9  years  of 
professional  programming  experience  (SD  =  4.1),  had 
a  working  knowledge  of  FORTRAN,  and  none  had  parti¬ 
cipated  in  the  previous  experiment. 

Procedure.  Generally,  the  procedures  employed 
in  this  experiment  were  identical  to  those  used  in 
Experiment  1.  In  a  preliminary  exercise,  all  par¬ 
ticipants  were  asked  to  modify  the  same  short  FOR¬ 
TRAN  program.  Following  this  initial  exercise, 
participants  were  presented  in  turn  with  the  three 
programs  comprising  their  experimental  tasks.  One 
modification  was  requested  for  each  program  and  was 
described  on  a  sheet  accompanying  the  program  list¬ 
ing.  Participants  were  allowed  to  work  at  their 
own  pace,  taking  as  much  time  as  needed  to  imple¬ 
ment  the  modification. 


Experimental  design.  In  order  to  control  for 
individual  differences  in  performance,  a  within- 
subject,  3**  factorial  design  was  employed.  Three 
of  the  nine  programs  from  Experiment  1  were  used. 
Three  types  of  control  flow  were  defined  for  each 
of  the  three  programs  using  the  same  approach  des¬ 
cribed  previously.  Each  of  these  nine  versions  was 
presented  with  one  of  three  types  of  commenting. 
Modifications  at  three  different  levels  of  diffi¬ 
culty  were  developed  for  each  program  generating  a 
total  of  81  experimental  programs.  Participants 
were  randomly  assigned  into  the  experimental  de¬ 
sign. 

Comments .  Three  levels  of  commenting  were 
tested  in  this  experiment:  global,  in-line,  and 
none.  Global  comments  provided  an  overview  of  the 
function  of  the  program  and  identified  the  primary 
variables.  In-line  comments  were  interspersed 
throughout  the  program  and  described  the  specific 
functions  of  small  sections  of  code. 

Modifications.  Three  types  of  modifications 
were  selected  for  each  program  as  typical  changes  a 
programmer  might  be  expected  to  implement.  The 
level  of  difficulty  for  seven  of  the  nine  modifica¬ 
tions  increased  as  more  lines  had  to  be  added  to 
the  original  code,  and  the  hardest  modification  for 
each  program  required  the  most  additional  lines. 

Dependent  variables.  The  dependent  variables 
were  the  correctness  of  the  modification  and  the 
time  taken  by  the  participant  to  complete  the  task. 
The  individual  steps  necessary  for  the  correct  im¬ 
plementation  of  each  requested  modification  were 
delineated  in  advance  and  assigned  equal  weights. 
That  is,  prototypes  of  each  program  with  each  mod¬ 
ification  correctly  implemented  were  established  as 
the  criteria  against  which  participants'  work  would 
be  compared.  A  percentage  score  representing  the 
functional  correctness  of  each  modification  was 
computed  by  comparing  a  participant's  changes  with, 
the  prototype  version.  All  values  for  the  complex¬ 
ity  metrics  computed  on  the  modified  programs  were 
computed  on  the  prototypes  with  correct  implementa¬ 
tions  rather  than  from  code  generated  by  the  parti¬ 
cipants.  The  time  to  implement  a  modification  was 
timed  to  the  nearest  minute  using  an  electronic' 
timer. 

Results 

Experimental  manlpul at  ions .  A  complete  report 
of  the  results  for  the  manipulations  in  this  exper¬ 
iment  appears  in  Sheppard,  Borst,  Curtis,  and  Love 
(in  press).  Briefly,  across  all  experimental  con¬ 
ditions  an  average  accuracy  score  of  62%  was  recei¬ 
ved  on  the  modi f ications  (SD  =  31%).  The  108  accu¬ 
racy  scores  ranged  from  five  scores  of  0%  to  24 
scores  of  100%  and  were  negatively  skewed.  The  av¬ 
erage  time  to  complete  the  modification  was  17.9 
minutes  (SD  *  11.4),  ranging  from  2  to  59  minutes 
with  a  positive  skew.  Accuracy  and  time  were  un- 
correlated. 

On  two  of  the  three  programs  studied,  the  ac¬ 
curacy  of  the  modification  was  found  to  be  modestly 
affected  by  both  the  difficulty  of  the  modification 
and  the  use  of  structured  programming  techniques. 


More  accurate  modif teat  ions  were  made  to  strictly 
Structured  rather  than  unstructured  programs.  The 
more  difficult  modifications  took  longer  to  imple¬ 
ment  ,  although  structured  programming  techniques 
had  no  effect  on  time  to  completion.  Neither  time 
nor  accuracy  was  affected  by  the  type  of  comment¬ 
ing. 

Software  complexity  metrics.  Correlations 
among  Halstead's  and  McCabe's  metrics  and  length 
were  quite  high  on  both  the  original  and  modified 
programs  (.88  £^£  .97,  £<  .001).  Correlations 
between  the  three  complexity  metrics  and  the  two 
dependent  variables  were  larger  in  the  aggregated 
than  in  the  unaggregated  data.  Most  significant 
correlations  with  performance  were  observed  for 
metrics  computed  on  the  modified  programs.  All 
three  metrics  were  moderately  correlated  with  time 
to  complete  the  modification  (.38  £  £  £  -^6*  £  £ 
.05),  while  only  length  and  McCabe's  v(G)  were  sig¬ 
nificantly  related  to  accuracy  (-34  £££“.36,  £  £ 
.05). 

Similar  to  results  in  Experiment  1,  the  rela¬ 
tionship  between  time  to  completion  and  Halstead's 
E  was  moderated  by  the  complexity  of  the  control 
flow.  The  correlations  for  Halstead's  E  with  per¬ 
formance  went  from  .08  in  the  structured  code,  to 
.28  (£  £  .05)  in  naturally  structured  code,  to  .38 
(£  £  .05)  in  unstructured  code.  No  such  moderat¬ 
ing  effects  were  observed  for  McCabe's  v(G) . 

Correlations  between  the  complexity  metrics 
and  performance  measures  were  also  moderated  by  the 
type  of  commenting.  Significant  correlations  on 
modified  programs  were  observed  when  no  comments 
were  included  in  the  program  for  both  accuracy 
(-.34  <  r  <  -.35,  £  £  .05)  and  time  (.44  £  £  £  .47, 
£  £  .01). 

The  amount  of  professional  programming  exper¬ 
ience  profoundly  moderated  the  relationships  obser¬ 
ved  between  the  complexity  metrics  and  time  to  com¬ 
pletion.  although  no  such  effect  was  observed  for 
accuracy.  Significant  correlations  (.52  <  r  <  .55, 
p  <  .001 )  were  observed  for  programmers  with  three 
' i  less  years  of  professional  experience,  while  no 
correlations  above  .20  were  observed  for  program¬ 
mers  with  more  than  three  years  experience. 

CONCLUSIONS 

The  two  experiments  comprising  this  study  pro¬ 
duced  empirical  evidence  that  software  complexity 
metrics  were  related  to  the  difficulty  programmers 
experienced  in  understanding  and  modifying  pro¬ 
grams.  Deeper  analysis  indicated,  however,  that 
the  Halstead  and  McCabe  metrics  predicted  program¬ 
mers*  performance  only  on  certain  programs.  Pro¬ 
grams  on  which  significant  prediction  was  observed 
were  character/  >:d  by  the  absence  of  programming 
practices  such  as  structured  coding  or  commenting 
which  provide  assistance  to  a  programmer  attempt¬ 
ing  to  understand  the  code.  These  complexity  met¬ 
rics  were  more  predictive  of  the  performance  of 
•ess  experienced  programmers. 

Assessment  of  the  psychological  complexity  of 
software  appears  to  require  more  than  a  simple 


count  of  operators  and  operands  or  basic  control 
paths.  Many  programs  have  characteristics  unasses¬ 
sed  by  these  metrics  which  may  heavily  influence 
psychological  complexity.  For  instance,  the  use  of 
structured  coding  techniques  or  comments  reduces 
the  cognitive  load  on  a  programmer  in  ways  unasses¬ 
sed  by  the  complexity  metrics.  Further,  complexity 
metrics  may  not  be  capturing  the  most  important 
constructs  for  predicting  the  performance  of  exper¬ 
ienced  programmers  who  may  either  be  conceptualiz¬ 
ing  programs  at  a  level  other  than  that  of  opera¬ 
tors,  operands,  and  basic  control  paths,  or  who  can 
fit  the  program  into  a  schema  similar  to  one  they 
have  had  previous  experience  with. 

Even  though  moderating  effects  were  observed 
in  these  data,  stronger  relationships  with  per¬ 
formance  may  have  been  masked  by  the  effects  of 
differences  between  individuals  and  programs  which 
were  enhanced  by  limitations  in  the  economical  mul¬ 
tifactor  designs  employed.  Uniformity  in  the  sizes 
of  programs  studied  may  also  have  limited  these  re¬ 
sults.  The  range  of  values  assumed  by  complexity 
metrics  computed  on  these  programs  may  have  been 
insufficient  for  correlat ion.il  tests  to  detect  the 
strong  relationships  reported  in  other  verifica¬ 
tions  of  these  theories  (Edwards,  19/6).  Studies 
reporting  higher  correlations  for  Halstead’s  E  us¬ 
ually  involved  a  broader  range  of  program  sizes 
(Fitzsimmons  &  Love,  1978,  Halstead,  1977). 

The  number  of  statements  in  the  code  proved  to 
be  as  good  a  predictor  of  performance  on  the  exper¬ 
imental  tasks  as  the  metrics  developed  by  Halstead 
and  McCabe.  These  results  did  not  concur  with 
those  of  Gordon  (1977)  who,  in  reanalyzing  Love's 
(1977)  data  on  programmer  understanding,  found  Hal¬ 
stead’s  metric  better  than  the  number  oi  statements 
for  predicting  the  percentage  of  statements  recall¬ 
ed.  Correlations  among  the  metrics  suggested  sub¬ 
stantial  overlap  in  the  constructs  they  quantified 
on  the  modular-sized  programs  studied.  Had  these 
correlations  been  smaller,  the  predictive  worth  of 
the  metrics  could  have  been  bettor  compared.  The 
number  of  statements  may  not  be.  as  highly  correla¬ 
ted  with  the  Halstead  and  McCabe  metrics  for  pro¬ 
grams  of  substantially  greater  length,  for  other 
types  of  applications,  or  for  programs  written  in 
languages  other  than  FORTRAN. 

A  characteristic  distinguishing  psychological 
from  computation.il  complexity  is  that  tin-  psyrho- 
logical  complexity  of  software  involves  an  interac¬ 
tion  between  program  characteristics  ami  individual 
differences,  such  as  programming  experience,* 
Chrysler  (1978)  demonstrated  the  value  of  these  ex¬ 
periential  variables  in  predicting  the  time  to  com¬ 
plete  a  programming  task.  Individual  differences 
should  not  be  overlooked  in  predicting  human  per¬ 
formance,  especially  when  the  performance  ratio 
comparing  good  to  bad  programmers  has  been  reported 
as  high  as  28  to  1  (Grant  &  Sack man,  1967). 

The  complexity  metrics  provided  some  sources 
of  information  about  program  differences,  but  there 
were  other  factors  within  the  programs  unassessed 
by  these  metrics  which  may  have  influenced  psycho¬ 
logical  complexity.  Neither  Halstead's  nor  Mc¬ 
Cabe's  metrics  consider  the  level  of  nesting  within 
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various  const ruct ions.  The  complexity  of  rpree  DO 
loops  in  succession  was  rated  identically  to  three 
nested  DO  loops,  although  nesting  may  influence 
complexity.  If  the  ability  of  complexity  metrics 
to  predict  human  performance  on  programming  tasks 
is  to  be  improved,  then  metrics  must  be  designed 
which  measure  phenomena  related  by  psychological 
principles  to  memory,  information  processing,  and 
problem  solving.  Thus,  while  the  number  of  control 
paths  may  be  critical  in  computational  complexity, 
variations  in  the  arrangement  and  connections  among 
these  control  paths  may  exert  profound  influence  on 
the  difficulty  of  understanding  the  functioning  of 
the  program.  Future  work  in  the  area  of  psycholog¬ 
ical  complexity  should  identify  a  set  of  cognitive 
psychological  principles  relevant  to  programming 
tasks.  Metrics  could  then  be  developed  which  as¬ 
sess  the  qualities  of  software  which  are  most 
closely  related  to  these  principles.  Such  an  exer¬ 
cise  might  not  only  lead  to  improved  metrics  for 
assessing  psychological  complexity,  but  may  also 
Identify  some  programming  practices  which  could 
lead  to  simplified,  more  easily  maintained  soft¬ 
ware. 
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The  software  measurement  studies 
began  at  General  Motors  Research  in  1973 
with  the  development  of  a  PL/I  scanner. 

The  scanner  was  written  to  make  one  pass 
over  the  code  to  be  s.canned  and  to  look 
ahead  one  token.  The  scanner  was 
modularised  so  that  various  counters  could 
easily  be  inserted  into  it  to  record  the 
occurrence  of  particular  constructs  in  the 
PL/I  programs  to  be  scanned.  The  recorded 
data  could  then  be  studied  by  using  a 
small  interactive  query  system  that 
existed  on  the  same  computer  as  the 
scanner . 

In  1974  a  large  volume  of  PL/I  code 
was  collected  from  five  representative 
General  Motors  data  processing  centers.  A 
subset  of  120  of  the  collected  programs 
that  comprised  over  100,000  PL/I  source 
statements  were  used  for  a  controlled 
empirical  study.  The  significant  data 
from  this  study  are  published  l  1  1  with 
only  minor  interpretation  in  order  that 
anyone  may  study  the  data  from  a 
particular  point  of  view.  The  data 
indicate  how  the  PL/I  language  was  being 
used  in  the  early  1970's.  An  averse 
compilable  unit  for  example  was  made  up  of 
a  single  procedure  containing  853  PL/I 
statements.  Control  structures  for  those 
programs  were  very  complex.  Data  handling 
and  expressions  were  found  to  be 
relatively  simple  but  voluminous. 

The  data  that  had  been  collected  were 
then  analyzed  l 2  )  with  respect  to  good 
programming  practices.  Most  of  these 
practices  fall  into  the  class  described 
genencally  as  structured  programming. 

The  reason  for  using  these  practices  lies 
in  common  sense  and  ]ust  a  few  published 
experiences  even  though  the  practices  are 
widely  touted.  The  program  units  were 
observed  to  be  too  large  since  an  average 
program  of  853  PL/Z  statements  contained 
386  identifiers,  1087  constants,  910 
expressions,  319  flow  of  control 
statements  (including  100  GO  TO 
statements),  and  50  statement  labels.  The 
programs  were  found  to  be  extremely 
difficult  to  read.  The  programs  were 


determined  to  be  extremely  complex  with 
respect  to  both  control  flow  and  data 
flow.  The  PL/I  programming  language  was 
used  poorly;  however,  only  a  limited 
subset  of  PL/I  was  necessary  to  perform 
the  computing  that  was  required. 

A  corporate  training  program  was 
introduced  as  a  result  of  the  analysis 
that  had  been  performed.  Programmers  and 
analysts  were  introduced  to  structured 
design  and  structured  programming 
techniques.  After  some  programmers  had 
worked  with  the  new  techniques  for  awhile, 
a  new  set  of  programs  that  had  been 
developed  aftei  the  training  period  were 
collected  and  analyzed.  The  analysis  of 
the  new  programs  (3)  showed  that  their 
profiles  reflected  the  training  to  which 
the  programmers  had  been  exposed. 

Although  no  specific  study  was  performed 
to  determine  the  overall  effectiveness  of 
the  new  design  and  programming  techniques 
with  respect  to  the  software  life  cycle, 
project  case  studies  indicated  reductions 
of  over  25%  in  development  time  could  be 
achieved.  Also,  case  studies  show  that 
maintenance  of  a  program  once  it  has  been 
committed  to  production  can  be  reduced  by 
as  much  as  a  factor  of  eight. 

One  of  the  major  problems  that 
surfaced  during  the  analysis  of  the  first 
set  of  programs  and  again  during  the 
training  program  was  that  modularization 
seemed  foreign  to  the  analysts  and 
programmers.  In  fact,  modularization  was 
viewed  as  detrimental  in  all  aspects  of 
the  software  life  cycle  due  to  the 
perceived  overhead  of  CALL/P. ETURN 
mechanisms  and  the  cost  of  packaging 
programs  m  small  individually  compilable 
units.  A  concerted  effort  was  made  to 
break  down  this  barrier  against 
modularization  in  any  form.  One  form  of 
this  work  was  a  model  of  compilation  costs 
(4)  that  indicates  that  compilation  costs 
are  miniimized  when  programs  are  packaged 
in  relatively  small  units. 

The  data  that  was  gathered  as  part  of 
the  program  analysis  phase  of  our  work  has 
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also  been  used  in  some  experiments  with 
Halstead's  software  science  IS).  The  data 
show  that  the  relationship  between 
measured  length  and  estimated  length  of 
programs  holds  for  programs  extending  over 
several  magnitudes  in  length  [61.  On  the 
other  hand,  the  conjecture  that  the  second 
set  of  analyzed  programs  would  be  more 
consistent  with  respect  to  the  length 
relationship  than  the  first  set  of 
programs  was  not  upheld. 

A  better  understanding  of  both 
software  science  and  GM's  PL/I  programs 
resulted  from  applying  the  theory  of 
software  science  to  the  programs.  This 
work  also  resulted  in  a  joint  effort  with 
Halstead  and  Gordon  which  led  to  two 
significant  extensions  of  software  science 
[71.  A  global  level  of  an  algorithm  which 
depends  only  upon  a  language  and  its  use 
was  derived.  This  development  led  to  a 
predictive  measure  for  estimating  the  time 
required  to  write  a  program.  The  global 
level  developed  m  this  work  is  of 
interest  since  it  ranked  an  assembler 
language.  FORTRAN,  ALGOL  58,  poorly  used 
PL/I,  well  written  PL/I,  and  English  in 
the  same  order  with  respect  to  the  level 
of  algorithm  expression  that  one 
intuitively  ranks  them. 

Even. though  software  science  is  based 
on  counting  operators  and  operands  used  to 
express  an  algorithm,  the  exact 
classification  of  a  language  token  as  an 
operator  or  an  operand  is  not  always 
clear.  This  classification  problem  was 
studied  in  an  experiment  which  applied 
different  counting  methods  to  a  fixed  set 
of  PL/I  programs  [81.  Some  properties  of 
the  algorithms  were  found  to  vary 
significantly  depending  on  the  counting 
method  that  was  used;  other  properties 
remained  stable.  Although  no  one  counting 
method  uas  shown  to  be  best,  the  results 
indicate  the  importance  of  the  counting 
method  to  the  overall  measurement  of  an 
algorithm.  Moreover,  the  results  provide 
a  reminder  of  how  sensitive  some  of  the 
measurements  are  and  of  how  careful 
researchers  must  be  when  drawing 
conclusions  from  software  science 
measurements . 

In  a  recent  study  19),  several  models 
of  operator  distributions  are  compared 
with  the  measured  distributions  found  in 
some  PL/I  programs.  The  theory  embodies 
models  of  operator  distributions  that  have 
been  proposed  by  Bayer,  Zipf,  and  Zweben. 
Two  of  the  three  models,  one  by  Zipf  and 
one  by  Zweben,  are  shown  to  correlate 
highly  with  the  measured  distributions. 

Two  variations  of  the  Zipf  model  are  then 
formulated  and  tested  with  good  results. 
Also,  the  relationship  between  the  method 
of  counting  operators  and  the  models  is 
inve  s  t iga ted . 


The  work  in  measurement  and  empirical 
studies  has  been  temporarily  suspended 
during  the  past  year  while  a  new  PL/I 
scanner  has  been  implemented.  The  new 
scanner  builds  an  abstract  tree 
representation  of  a  program  which  uill 
permit  much  more  extensive  measurement 
capability  than  has  been  available  in  the 
past.  It  is  hoped  that  the  next  set  of 
programs  that  are  collected  for  study  uill 
have  some  historical  data  of  their 
development  available  to  provide  a  basis 
for  comparison  as  we  continue  our  program 
analysis  studies. 
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Introduction 

At  the  Airlie  House  Workshop  on  Soft¬ 
ware  Life  Cycle  Management  a  year  ago,  a 
few  of  us  were  just  becoming  aware  of  the 
possibility  of  applying  the  experimental 
and  theoretical  results  of  software 
science  to  the  real  world  problems  of  com¬ 
puter  systems  management.  In  this  paper, 
a  year  later,  I  will  attempt  to  review  for 
this  workshop  some  of  the  more  important 
new  findings  which  other  software  scien¬ 
tists  have  reported,  discuss  their  signif¬ 
icance,  and,  if  time  permits,  consider  an 
area  in  which  further  work  might  be  quite 
profitable . 

Even  though  all  of  the  work  I  will 
report  on  is  rather  intimately  interrelated, 
it  can  conveniently  be  divided  into  a 
number  of  areas  in  which  important  progress 
has  been  described  in  sufficient  detail 
that  I  can  summarize  it  for  you.  Special 
attention  will  be  given  to  the  new  results 
in  the  areas  of  programming  rates,  precis¬ 
ion  and  accuracy  of  relations,  error  rates, 
and  operator  frequency  distributions.  For 
anyone  unfamiliar  with  the  basic  software 
metrics  and  equations,  either  the  mono¬ 
graph  [ HAL77 ]  or  the  paper  in  last  year's 
proceedings  [HAL77b]  provide  adequate  back¬ 
ground,  and  the  notation  used  there  will 
be  followed. 

Programming  Rates 

In  the  December  1977  issue  of  the  IBM 
Systems  Journal  Claude  Walston  and  C.  P. 
Felix  presented  a  terse  but  highly  signi¬ 


ficant  extension  "On  Lines  of  Code  and  Pro¬ 
grammer  Productivity"  [WAF77b]  to  their 
earlier  paper  on  "A  Method  of  Programming 
Measurement  and  Estimation"  (WAF77).  In 
the  extension  Walston  approximated  the 
software  science  transcendental  equations 
relating  programming  time  to  lines  of 
source  code  by  the  expression 

P  =  1.27  T2/s  (1) 

where  P  is  in  thousands  of  source  state¬ 
ments  and  T  is  in  man-months.  He  then 
performed  a  least-squares  fit  to  the  60 
large  systems  in  their  IBM  data  base  and 
obtained 

P  =  0 .925  T°  ‘  70  (2) 

While  the  degree  of  agreement  between 
the  theory  (Equation ( 1 ) )  and  experience 
(Equation (2) )  is,  in  Walston's  words,  in¬ 
terestingly  close,  it  takes  a  bit  of  pon¬ 
dering  to  realize  the  full  significance 
of  his  finding.  First,  we  must  realize 
that  his  data  base  covers  a  range  of  pro¬ 
gramming  projects  of  from  less  than  12 
man-months  to  a  maximum  of  11,758  man- 
months.  This  suggests  that  the  effect  of 
job  size  on  programming  rates ,  not  just 
total  programming  times,  may  be  well  de¬ 
fined  by  the  data,  as  represented  by 
equation  ( 2) . 

Consequently,  we  can  use  either  one 
of  these  equations  to  obtain  the  number 
of  source  lines  per  programmer  per  month 
for  any  given  job  size,  where  the  job  size 
can  be  expressed  either  in  total  lines  of 
source  code  or  in  total  implementation 
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time.  Using  lines  of  source  code,  we  have 


Joo  Size 

Source 

Lines  per  Man-Month 

( LOSC) 

Theory 

Observation 

10,000 

453 

333 

100,000 

143 

1  24 

1 ,000,000 

46 

46 

and  using  total 

man-months  gives 

Job  Size 

Source 

Lines  per  Man-Month 

(Man-Months) 

Theory 

Observation 

12 

555 

439 

120 

257 

220 

1 ,200 

1  20 

110 

12,000 

55 

55 

Expressed 

in  this 

way,  it  is  apparent 

that  we  now  nave,  quantitatively,  a  the¬ 
oretical,  experimentally  confirmed  expla¬ 
nation  for  not  just  average  programming 
times,  but  the  decrease  in  hourly  pro¬ 
duction  rates  accompanying  increases  in 
project  size. 

When  one  remembers  that  the  effort 
relation  was  originally  tested,  and  act¬ 
ually  derived,  with  programming  tasks  re¬ 
quiring  from  5  to  90  minutes ,  its  exten¬ 
sion  to  real  world  projects  of  this  mag¬ 
nitude  suggests  that  the  most  important 
variables  have  been  identified,  and  their 
role,  on  the  average  or  statistically 
speaking,  is  now  understood. 

From  the  engineering  point  of  view 
there  is  little  to  choose  between  equation 
(1)  and  equation  (2),  because  they  both 
give  about  the  same  answers.  But  from  the 
scientific  point  of  view,  the  difference 
is  fundamental.  This  is  true  because  the 
theoretical  equation  was  derived  from 
first  principles.  Its  coefficients  depend 
only  upon  four  measurable  quantities:  1) 
the  rate  at  which  the  brain  can  make 
elementary  mental  discriminations,  or 
Stroud  Number  (18  e .m.d. /sec . ) ;  2)  the 
average  number  of  operators  and  operands 
in  an  executable  statement  (7.5);  3)  the 


average  fraction  of  total  statements  which 
are  executable  (1/2) ;  and  4)  the  average 
language  level  (1=1.34).  Consequently,  if 
any  one  of  these  parameters  is  changed, 
the  effect  on  programming  rates  is  calcu¬ 
lable  . 

Precision  and  Accuracy 

The  question  of  how  closely  we  should 
expect  the  various  equations  of  software 
science  to  reflect  individual  programs  has 
been,  in  principle,  unanswered.  Most  of 
us  working  in  the  field  have  been  sub¬ 
scribing  to  the  view  that  software  science 
is  a  bit  like  actuarial  statistics.  In 
that  field,  for  example,  one  might  find 
that  men  at  age  65  have  a  life  expectancy 
of  1 4  years.  But  this  in  no  way  guaran¬ 
tees  that  any  particular  65  year  old  man 
will  not  be  killed  by  a  truck  in  the  next 
hour.  In  other  words,  the  accuracy  of  the 
actuarial  prediction  is  completely  ade¬ 
quate,  but  its  precision  is  so  poor  that 
for  any  individual  case  it  may  be  useless. 

Now  if  we  examine  a  recent  study  of 
34  General  Motors  PL/I  programs  by  Elshoff 
[ELS 78] ,  we  see  that  the  correlation  be¬ 
tween  the  observed  lengths  of  programs  and 
the  lengths  calculated  from  the  vocabulary 
-length  equation  is  a  highly  significant 
0.988,  and  the  error  of  the  mean  is  only 
6.5%,  according  to  his  Table  4.  But  if 
we  examine  the  errors  for  individual  pro¬ 
grams  in  that  table,  we  see  that  they 
average  13.1  per  cent  on  an  absolute  basis. 
For  most  engineering  or  management  purposes, 
frequently  concerned  with  differences 
among  programs  ranging  from  500  to  50,000 
statements,  given  a  13  per  cent  error  in 
an  individual  program  is  not  too  serious. 

But  from  the  scientific  point  of  view 
such  errors  raise  an  interesting  and  per¬ 
haps  important  question.  If  we  considered 
only  one  of  Elshoff 's  34  programs,  would 
its  deviation  from  the  vocabulary-length 
relation  be  attributable  merely  to  the  way 


in  which  it  had  been  written,  or  might  it 
instead  depend  upon  the  task  it  was  de¬ 
signed  to  perform?  If  it  is  the  later, 
then  the  relation  is  indeed  akin  to 
actuarial  statistics,  devoid  of  causes, 
but  if  it  is  the  former  then  it  results 
only  from  "experimental  error".  This 
question  could  be  resolved  if  each  of  the 
34  different  programs  was  independently 
rewritten  ten  or  twenty  times.  Then,  if 
the  mean  values  for  each  of  the  34  tasks 
showed  the  same  average  deviation,  that 
deviation  should  be  attributed  to  some  un¬ 
identified  characteristic  of  the  task.  On 
the  other  hand,  if  the  process  of  averag¬ 
ing  over  many  independent  implementations 
of  each  task  reduced  the  deviation,  then 
we  would  be  safe  in  concluding  that  the 
length-vocabulary  relation  is  completely 
independent  of  the  task.  If,  in  addition, 
it  could  be  shown  that  such  an  averaging 
procedure  yields  increasingly  close  agree¬ 
ment  with  other  software  relationships, 
then  the  "actuarial  statistics"  analogy 
would  have  to  be  abandoned. 

While  the  cost  of  performing  the  ex¬ 
periment  described  above  is  prohibitive,  a 
smaller  but  much  more  comprehensive  study 
recently  completed  by  Woodfield  [W0078] 
appears  to  settle  the  question  in  the 
affirmative . 

Woodfield  reports  upon  two  small 
related  programs,  each  written  independ¬ 
ently  by  18  Fortran  programmers.  The  first 
program  had  four  input/output  parameters, 
or  n*=4.  The  second  program  merely  in¬ 
creased  the  problem  dimensions  by  one,  or 
* 

n2  =  5.  The  remarkable  thing  about  Wood- 
field's  study  is  that  in  both  cases  he  was 
able  to  predict  not  only  the  observed 
length,  but  seven  other  characteristics  of 
the  programs  from  nothing  but  the  two 
values,  language  level  X  =  1 ,  n2=4,5.  The 
results  taken  from  his  Table  2  are  as 
follows . 


Property  Original  Version 


Predicted 

Observed 

n  i 

Unique  Operators 

9.27 

9.89+1 .81 

n2 

Unique  Operands 

8.84 

8.72+1 .18 

n 

Vocabulary 

18.11 

18.61+1.69 

N 

Length 

57.57 

61  .  83±7 .72 

V 

Volume 

241 

261+39 

L 

Leve  1 

.06*45 

.0585+ .0146 

V* 

Potential  Volume 

15.51 

14.95+3.02 

* 

n2 

I/O  Parameters 

4 

3.84+. 76 

Property 

Extended  Version 

■■■ 

Predicted 

Observed 

hi 

Unique  Operators 

10.59 

10.61+1 .75 

n2 

Unique  Operands 

13.11 

12.44+2.83 

n 

Vocabulary 

23.70 

23.06+2.98 

N 

Length 

84.53 

95.89+20.95 

V 

Volume 

386 

436+108 

L 

Level 

.0509 

.0481+. 0117 

V* 

Potential  Volume 

19.65 

?0 . 77+6 . 85 

* 

"2 

I/O  Parameters 

5 

5.21+1.58 

In  considering  the  agreement  between 
the  values  predicted  by  theory  and  those 
observed,  Woodfield  notes  that  in  no  case 
was  the  discrepancy  as  great  as  six-tenths 
of  the  standard  deviation,  and  that  the 
average  error  of  the  16  predictions  was 
only  1.4  per  cent . 

In  light  of  these  results  it  seems 
safe  to  assume,  for  small  programs  at 
least,  that  deviations  between  observation 
and  theory  can  be  attributed  to  sampling 
error,  rather  than  to  unidentified  charac¬ 
teristics  of  the  programs  themselves. 

Error  Rates 

Linda  Ottenstein  has  recently  com¬ 
pleted  a  comprehensive  study  [OTT78]  re¬ 
lating  programming  error  rates  to  the  soft¬ 
ware  parameters.  While  she  analyzes  a 
number  of  sets  of  data,  including  studies 
of  her  own,  perhaps  the  most  significant 
finding  concerns  the  data  of  Lipow  and 
Thayer  [LIT77J .  She  derives  the  predicted 


number  of  "Validation"  bugs,  Bv  as 


Bv  =  V*/LE0  =  V/3000 


where  V  is  the  volume,  V*  is  the  potential 
volume,  L  is  the  implementation  level  and 
the  constant  Eo=3000  is  obtained  directly 
from  the  psychological  concept  of  chunking. 
Applied  to  the  functionally  grouped  data 
of  [LIT77],  this  gives,  according  to  her 
Table  2.3. 

Routine  Delivered  Bugs 

Observed  Predicted 


Bv 

»v 

A1 

26 

45 

A2 

67 

63 

A3 

54 

62 

A4 

41 

47 

A5 

79 

121 

Bl 

105 

66 

B2 

95 

78 

Cl 

239 

221 

C2 

69 

105 

C3 

55 

33 

C4 

27 

20 

C5 

50 

70 

C6 

48 

52 

D1 

87 

180 

D2 

13 

20 

El 

144 

136 

FI 

4 

10 

F2 

8 

24 

F3 

8 

28 

F  4 

30 

32 

F5 

30 

58 

G1 

238 

241 

G2 

22 

29 

HI 

1 

7 

H2 

466 

406 

Coef.  of  Corr.  r  =  0.962 

The  agreement  between  observation  and 
theory  is  such  that  all  but  7^  per  cent 
(1-ra)  of  the  variance  is  explained  by  it. 
But  because  the  error  or  "problem  report" 
data  in  {LIT77]  were  available  both  in  the 


25  functional  groupings  above,  and  also  in 
terms  of  some  250  individual  procedures 
[THA76]  making  up  these  25  functional 
groups,  Ottenstein  was  able  to  test  facet 
of  equation  (3) .  According  to  its  deri¬ 
vation,  while  the  number  of  errors  must 
increase  with  the  size  of  a  program,  the 
rate  at  which  they  are  made  must  decrease 
as  the  programmer  gains  familiarity  with 
the  program  on  which  he  is  working.  Con¬ 
sequently,  if  we  considered  two  subproce¬ 
dures  of  equal  volumes,  we  should  expect 
different  error  rates  depending  upon  the 
order  in  which  they  were  written.  For 
the  total  function,  only  the  total  volume 
would  be  involved,  but  for  a  sub-function, 
its  implementation  order  should  contribute 
to  its  variance.  The  theory  therefore 
predicts  that  the  correlation  between 
Bv  and  Bv  for  the  250  sub-procedures 
should  not  be  near  one,  but  near  the 
square  root  of  one-half  or  0.71.  Per¬ 
forming  the  analysis,  she  found  a  value  of 
r=0.756,  rather  than  the  r=0.962  for  the 
functionally  grouped  units. 

This  gives  further  credence  to  her 
derived  expressions  for  the  average  time 
required  to  find  and  correct  one  "valid¬ 
ation  bug"  , 

I  1  =  J.  X  3000  /  4 \ 

and  for  tne  average  number  of  required 
runs  per  day  per  programmer  during  valid¬ 
ation 

Rv/day  =  «8SL  (5) 

where  L  is  the  implementation  level  and  S 
is  the  Stroud  Number. 

Operator  Frequency  Distributions 


A  number  of  research  workers  have 
been  intrigued  by  the  uniform  pattern 
observed  whenever  the  frequency  of  occur¬ 
rence  of  individual  operators  in  a 
program  are  rank-ordered,  and  a  series  of 
papers  have  resulted  from  their  attempts 
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to  derive  a  mathematical  expression  for 
the  observed  frequency  distribution.  One 
of  several  motivations  for  this  work  lies 
in  the  possibility  that  the  results  would 
be  useful  in  resolving  ambiguities  in  the 
classification  of  operators  in  new 
languages . 

In  a  recently  submitted  paper, 

Zweben  [ZHE78]  has  demonstrated  that  the 
relationship  obeys  the  equation 

i  =  Xolog2  (Xo/XjJ  (6) 

where  i  is  the  rank -order,  starting  from 
zero  for  the  most  frequently  occurring 
operator,  and 

=  log2fi  (7) 

where  f^  is  the  number  of  occurrences  of 
the  i  th  most  frequent  operator.  Since 
the  total  number  of  operators  Ni  must  be 
given  by 


m-1 

N,  =  I  fA  (8) 

i=0 

it  is  possible  to  solve  for  the  expected 
number  of  occurrences  f^  for  any  value  of 
i,  given  >nly  n i  and  N, .  For  34  PL/I 
programs,  they  report  a  mean  correlation 
coefficient  of  0.989,  and  a  line  of  re¬ 
gression  with  a  slope  of  0.984  and  an  in¬ 
tercept  of  -0.28. 


Missing  Data 


There  are  still  more  areas  in  software 
science  needing  study  than  there  are 
presently  being  investigated.  Basic  issues, 
such  as  the  role  of  modularity,  application 
to  top-down  design,  and  to  learning  theory 
may  yet  require  years  of  effort.  But  m 
considering  the  specific  field  of  Life 
Cycle  Management,  it  is  easy  to  note  that 
one  simple  tool  is  missing.  No  one  has 
yet  implemented  a  software  parameter 
analyzer  for  Cobol.  Consequently,  we  have 
mean  values,  and  variances,  of  the 
language  levels  for  Fortran,  PL/I,  Algol 


and  others,  but  not  for  Cobol. 

References 

1ELS78]  Elshoff,  James  L.,  "An  Inves¬ 
tigation  into  the  Effects  of 
Counting  Methods  Used  on  Soft¬ 
ware  Science  Measurements",  ACM 
SIGPLAN  Notices,  Vol.  13,  No.  2, 
February  1978,  pp.  30-45. 

[HAL77]  Halstead,  M.  H.,  Elements  of 

Software  Science,  Elsevier  North 
Holland,  New  York,  1977. 

[HAL77b]  Halstead,  M.  H.,  "Potential 

Impacts  of  Software  Science  on 
Software  Life  Cycle  Management", 
in  Software  Phenomenology,  U.S. 
Army  Institute  for  Research  in 
Management  Information  and 
Computer  Science,  August  1977, 
pp.  385-400. 

[LIT77]  Lipow,  M.,  and  T.  A.  Thayer, 

"Prediction  of  Software  Failures 
Proceedings  of  the  1977  Annual 
Reliability  and  Maintainability 
Symposium,  January  18-20,  1977. 

[OTT78]  Ottenstein,  Linda  M. ,  "Predict¬ 
ing  Parameters  of  the  Software 
Validation  Effort",  Ph.D.  Thesis 
Purdue  University,  August  1978. 

{THA75 ]  Thayer,  T.  A.,  "Understanding 
Software  through  Empirical  Re¬ 
liability  Analysis",  AFIPS  Con¬ 
ference  Proceedings,  Vol.  44, 
1975,  pp  335-341 . 

(WAF77]  Walston,  C.  E.  and  C.  P.  Felix, 
"A  Method  of  Programming  Measure 
ment  and  Estimation" ,  IBM 
Systems  Journal,  Vol.  16,  No.  1, 
pp.  54-73. 

[WAF77b]  Walston,  C.  E.  and  C.P.  Felix, 


179 


"on  Lines  of  Code  and  Programm- 
Productivity,"  IBM  Systems 
Journal,  Vol.  16,  No.  4,  1977, 
pp.  421-423. 

(W00781  Woodfield,  Scott,  "An  Experiment 
on  Unit  Increase  in  Algorithm 
Complexity" ,  Submitted 

[ZHE78]  Zweben,  S.  H.,  M.  H.  Halstead 

and  James  L.  Elshoff,  "The  Fre¬ 
quency  Distribution  of  Operators 
in  PL/I  Programs"  Submitted. 


180 


COST  EFFECTIVENESS  IN  SOFTWARE  ERROR  ANALYSIS  SYSTEMS 


Mary  Anne  Herndon 


Herndon  Science  &  Software 
San  Diego,  C A 


ABSTRACT 

Software  error  analysis  systems  must  have  the 
capability  of  functioning  as  both  a  cost  effective 
and  valuable  managerial  tool.  To  achieve  this  capa¬ 
bility,  the  design  of  the  data  collection  must  re¬ 
flect  the  individual  project’s  managerial  concerns, 
and  the  resulting  empirical  analysis  should  be 
available  for  long  term  access. 


INTRODUCTION 

The  implementation  of  software  error  analysis 
systems  has  resulted  in  enormous  expenditures  for 
data  collection  and  analysis.  Although  many  re¬ 
searchers  have  reported  empirical  analyses  of  these 
data  (1,2,3),  few  important  managerial  problems 
have  been  addressed  and  solved.  Researchers  have 
analyzed  the  data  to  provide  descriptors,  such  as 
estimates  of  the  mean  time  between  failures,  and, 
to  a  limited  extent,  error  classifications.  Ef¬ 
fective  project  management,  however,  demands  more 
information  than,  for  example,  an  estimate  of  mean 
time  between  failures,  or  a  tally  of  the  number  of 
register  misassignments .  As  yet,  neither  the  ex¬ 
pense  of  collecting  software  error  data  nor  the 
personnel  difficulties  encountered  have  been  justi¬ 
fied  by  the  usefulness  of  the  information  obtained. 

There  is  a  variety  of  ways  to  collect  immense 
amounts  of  different  types  of  software  error  data. 
In  response  to  key  project  managerial  demands,  how¬ 
ever,  relevant  data  are  currently  being  ignored. 

In  order  to  make  software  error  analysis  systems 
both  a  valuable  and  cost  effective  managerial  tool, 
attention  needs  to  be  focused  on  the  relevant 
issues  in  project  management. 

DATA  COLLECTION  DESIGN 

The  selection  of  data  to  be  collected  during 
validation  should  be  tailored  to  the  management 
goals  of  each  project.  Key  management  personnel, 
as  well  as  in-house  reliability  personnel,  should 
be  consulted  in  the  selection  of  data  items.  Al¬ 
though  each  project  requires  different  types  of 
information,  still,  there  are  managerial  concerns 
that  are  crucial  to  all  types  of  software  projects. 
The  first  concern  centers  around  the  utilization  of 
expenditures  allocated  for  validation.  It  is  a 
well  known  fact  that  a  large  portion  of  total  pro¬ 
ject  expenditures  go  into  formal  validation,  even 
as  much  as  30%  to  75%,  depending  upon  the  type  of 


project.  However,  the  accounting  of  total  expendi¬ 
tures  is  usually  not  performed  until  test  comple¬ 
tion.  A  dally  or  session  accounting  of  expendi¬ 
tures  provides  for  more  effective  monitoring  of  the 
progress  of  validation  than  a  final  accounting. 

In  designing  the  types  of  information  for  monitor¬ 
ing  the  apportionment  of  expenditures,  the  follow¬ 
ing  items  are  quite  useful  in  current  and  predic¬ 
tive  measures  of  expenditure  utilization. 

1.  A  cost  categorization  for  errors,  such  as  pro¬ 
posed  by  Herndon  and  Keenan  (4)., 

2.  A  tallying  of  cost  types  of  errors  detected 
during  each  test  session. 

From  this  minimal  amount  of  data,  managers  are 
able  to  obtain  intuitive  information  on  the  cost 
effectiveness  and  error  detection  effectiveness  of 
the  validation  procedure. 

The  second  managerial  concern  centers  around 
providing  information  describing  the  dynamics  of 
the  functioning  system  during  validation.  Compli¬ 
cated  software  systems,  such  as  exemplified  by 
real  time  systems,  need  to  be  understood  in  terms 
of  module  interactions.  Although  modern  program 
design  philosophy  attempts  to  minimize  the  extent 
such  interactions,  in  the  design  of  actual  systems 
this  is  not  always  achieved.  Consequently,  errors 
that  appear  are  not  always  localized  to  the  module 
of  occurrence.  This  type  of  information  provides 
useful  diagnostic  information  in  the  event  of  ser¬ 
ious  operational  problems.  In  one  empirical  study 
of  a  real  time  system  (5) ,  a  seemingly  innocous 
module  contained  an  inordinate  number  of  high  im¬ 
pact  errors.  The  following  minimal  set  of  data  is 
suggested  to  obtain  system  interaction  profiles. 

1.  Error  impact  classif ication  categories,  such 
as  those  proposed  by  Herndon,  ot  al. 

2.  Profiles  of  each  system  component  for  each 
category . 

3.  Calculation  of  an  average  impact  for  each 
module. 

Observations  from  this  empirical  analysis  can 
indicate  potentially  troublesome  or  unstable  area 
of  the  system.  For  example,  if  one  particular 
module  seems  to  be  associated  with  a  large  number 
of  high  impact  failures,  redesign  of  the  module 
might  be  advantageous  before  the  product  is 
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released.  With  the  proper  identification  of  un¬ 
stable  modules,  appropriate  fault  tolerant  mea¬ 
sures  can  be  used  for  compensation. 

RECOMMENDATIONS 


In  surveying  the  current  state  of  the  art  of 
software  error  analysis  systems,  two  glaring  de¬ 
ficiencies  are  evident.  The  first  deficiency  con¬ 
cerns  the  choice  of  data  that  is  selected  for  col¬ 
lection.  Software  error  analysis  systems  must 
function  as  a  cost  effective  managerial  tool  in 
order  to  compensate  for  the  overhead  that  is  as¬ 
sociated  with  data  collection.  As  of  now,  there 
is  a  distinct  impression  of  quantity  of  data  items 
as  opposed  to  relevancy .  Cost  effectiveness  in 
software  error  analysis  systems  can  only  be 
achieved  by  the  proper  design  of  a  minimal  set  of 
information  to  answer  specific  managerial  ques¬ 
tions  . 

The  second  deficiency  concerns  the  short¬ 
lived  usefulness  and  availability  of  the  data 
after  the  product  is  released.  The  empirical 
analysis  of  the  reliability  data  that  is  obtained 
during  validation  provides  an  initial  system 
reliability  profile.  As  the  product  ages  and 
requires  alterations,  the  information  gained  from 
validation  is  crucial  for  effective  field  manage¬ 
ment  . 


REFERENCES 

(1)  J.D.  Musa,  A  theory  of  software  reliability 
and  its  application,"  IEEE  Trans,  on  Soft¬ 
ware  Engineering,  Vol.  SE-I,  No.  3,  September 

1975,  pp.  312-327. 

(2)  M.  Shooman,  "Structural  models  for  software 
reliability  prediction,"  in  Proc.  2nd  Inter¬ 
national  Conference  on  Software  Engineering, 
San  Francisco,  Oct.  1976,  pp.  268-280. 

(3)  H.  Hecht,  "Reliability  Measurement  During 
Software  Engineering,  San  Francisco,  Oct. 

1976,  pp.  268-280. 

(A)  M.A.  Herndon  and  J.A.  Lane,  "Analysis  of 

Software  Errors  for  Cost  Factors,"  Proceed¬ 
ings  of  1977  AIAA  Computers  of  Aerospace 
Conference,  Los  Anges,  CA,  pp.  455-459 

(5)  M.A.  Herndon  and  A.P.  Keenan,  "Analysis  of 
Error  Remediation  Expenditures  During  Valid¬ 
ation,"  Proceedings  of  the  3rd  International 
Conference  on  Software  Engineering,  Atlanta, 
GA,  1978,  pp.  202-206. 


182 


STATISTICAL  TECHNIQUES  FOR  COMPARISON  OF  COMPUTER  PERFORMANCE 


Sandra  A.  Mamrak 
The  Ohio  State  University 
Columbus,  Ohio  43210 


Abstract 

The  comparison  of  computer  performance  requires 
a  methodology  designed  to  lead  to  the  selection 
of  the  best  computer  and  to  provide  control  of 
the  probability  of  having  made  a  correct  choice. 
Methodologies  often  used  in  classical  statisical 
designs  lead  to  regression  analyses  of  the  data, 
employing  either  analysis  of  variance  or  curve¬ 
fitting  techniques.  The  questions  that  can  be 
answered  using  such  methodologies  are  of  the 
type  "Is  the  performance  of  several  alternative 
systems  the  same  (are  the  distributions  of  per¬ 
formance  measurements  identical  from  a  statisi¬ 
cal  point  of  view)?"  or  "How  does  one  particular 
system  performance  parameter  depend  upon  the 
other  system  parameters?".  In  most  computer 
comparison  efforts,  however,  these  questions  are 
not  appropriate.  The  question  of  real  interest 
is:  "Which  system  is  the  best?"  or  "How  do  the 
systems  rank  from  best  to  worst?"  It  is  pre¬ 
cisely  for  this  type  of  problem  that  statisti¬ 
cal  ranking  and  selection  procedures  were 
developed.  These  procedures  are  applied  in  the 
selection  methodology  outlined  below. 


A  Computer  Selection  Model 

A  model  of  the  way  computer  selection  is  typically 
done  focuses  the  main  components  of  the  computer 
selection  process  and  provides  a  framework  for 
this  paper.  The  model  is  presented  in  Figure  1 
and  discussed  in  the  next  two  sections. 

Classification  of  Performance  Criteria 

In  choosing  the  best  alternative  from  several 
alternatives,  criteria  must  be  defined  which 
state  what  is  meant  by  best.  These  criteria  can 
be  categorized  in  two  ways.  They  are  either 
measurable  or  nonmeasurable,  and  they  are  either 
mandatory  or  desirable.  Thus,  the  selection 
criteria  can  be  classified  as  Mandatory  Non¬ 
measurable  (MN),  Mandatory  Measurable  (MM), 
Desirable  Nonmeasurable  (ON),  and  Desirable 
Measurable  (DM).  Examples  of  each  type  of 
criterion  are  provided  in  Table  1. 

Application  of  Performance  Criteria 

Figure  1  illustrates  a  sequence  in  which  the 
classes  of  selection  criteria  are  applied  in  the 
process  of  choosing  the  best  alternative  system. 


This  sequence  is  composed  of  three  phases.  The 
application  of  MN  criteria  in  phase  1  is  easily 
managed,  since  each  alternative  either  does  or 
does  not  have  the  required  characteristic. 

Phase  II  involves  the  application  of  MM  criteria. 
Experiments  are  conducted  on  the  alternatives 
which  survived  Phase  I  and  the  performance  of  each 
is  documented.  In  general,  for  each  MM  criterion, 
measurements  are  gathered  from  every  system  and  a 
decision  is  made  as  to  whether  or  not  the  criteria 
are  satisfied.  Failure  to  satisfy  a  single  MM 
criterion  results  in  an  alterntiave’s  elimination. 
At  the  conclusion  of  Phase  II,  the  number  of 
alternatives  is  usually  reduced  from  that  of  Phase 
I. 

Finally,  determination  of  the  best  alternative  is 
made  in  Phase  III.  This  stage  is  separated  into 
two  parts.  Phase  IIIA  for  the  application  of  DN 
criteria  and  stage  1 1 1 B  for  the  application  of 
DM  criteria.  For  DM  criteria,  data  are  collected 
from  each  alternative  being  evaluated  and  com¬ 
pared.  On  the  basis  of  relative  performance  with 
respect  to  the  DN  and  DM  criteria,  an  alternative 
is  selected  as  the  best.  In  both  Phases  II  and 
1 1  IB ,  comparison  requires  collecting  and  analyz¬ 
ing  relevant  performance  measurements  for  the 
various  computer  alternatives  under  consideration: 
in  Phase  II  to  select  those  satisfying  certain 
mandatory  performance  standards  and  in  Phase  II IB, 
to  select  the  best  remaining  one. 

A  selection  process  requires  a  methodology  to  lead 
to  the  selection  of  better  than  standard  alterna¬ 
tives  and  selection  of  the  best  alternative  and  to 
simultaneously  provide  control  of  the  probability 
of  having  made  correct  choices.  This  work  focuses 
its  attention  on  the  specification  of  a  methodology 
for  those  problems  which  occur  in  Phase  II  and 
IIIB,  thereby  leading  to  a  more  comprehensive, 
scientific  methodology  for  computer  selection. 

A  Computer  Selecti on  Methodology 

A  good  experimental  design  is  a  critical  component 
of  any  comparison  methodology,  since  the  efficiency 
of  the  data  collection  process  and  the  validity  of 
the  data  analysis  depend  upon  it.  Ranking  and 
selection  procedures  (see  [KI.E751  or  [ c: I B 7 7 ]  for  a 
survey  of  these  techniques)  provide  an  appropriate 
experimental  design  for  computer  selection.  These 
procedures  can  be  roughly  characterized  as  follow¬ 
ing  three  lines  of  development:  one  set  of  pro¬ 
cedures  ranks  systems  by  comparing  sample  means. 


Table  1.  EXAMPLE  OF  PERFORMANCE  CRITERIA 


Type  Example 

Mandatory  1.  The  system  must  be  fully  delivered  and  operational 

Nonmeasurable  no  later  than  September  1,  1979. 

2.  Timesharing  service  must  include  FORTRAN,  Basic,  Lisp, 
_ SNOBOL  and  editing  facilities. _ _ _ _ _ _ _ 

Mandatory  1.  The  mean-time-to-failure  for  a  specific  one  month 

Measurable  period  must  be  greater  than  4  hours. 

2.  95%  of  all  trivial  command  response  times  must  be 

_ _ less  than  1  second.  _ _ _ 

Desirable  1.  It  is  desirable  that  the  system  include  Pascal  and 

Nonmeasurable  COBOL  facilities. 

2_, It  is  desired  that  the  system  cost  less  than  $100,000. _ 

1.  It  is  desired  that  the  system  provide  as  fast  a  mean 
turnaround  time  as  possible  for  the  benchmark  run. 

2.  It  is  desired  that  response  time  means  as  well  as 

variations  bn  small. _ _  _ _ __ _ 


Desirable 

Measurable 


one  by  comparing  sample  percentiles  and  one  by 
comparing  sample  proportions.  In  all  three  cases, 
the  procedures  specify  the  number  of  data  points 
which  must  be  collected  from  each  system  in  a 
comparison  study  in  order  to  guarantee  that  the 
probability  of  a  correct  selection  be  greater  than 
or  equal  to  a  predetermined  minimum  value. 


The  use  of  a  mean,  percentile  or  proportion 
statistic  for  system  comparison  is  an  analyst 
choice  based  on  considerations  about  the  objectives 
of  a  comparison  experiment,  the  statistical 
properties  of  the  data  and  the  statistical  require¬ 
ments  of  the  data  analysis  techniques.  Means  are 
often  used  for  comparisons  when  criteria  like 
script  turnaround  time  or  script  cost  are  of 
interest.  For  comparison  criteria  such  as 
response  time,  which  tend  to  have  exponenti al -1  ike 
distributions,  the  mean  is  not  as  meaningful  a 
statistic.  Percentages  or  proportions  are  more 
appropriate. 


The  percentile  and  proportion  approaches  to  com¬ 
parison  are  very  similar  in  that  they  both  rely 
on  a  single  comparison  criterion's  cumulative 
distribution.  The  difference  lies  in  whether  an 
analyst  prespecifies  a  desired  percentage  value 
or  a  desired  comparison  criterion  value.  In  a 
comparison  based  on  percentiles,  a  percentage  is 
predetermined.  Results  are  produced  of  the  form: 

"if  computer  service  A  has  90"  of  its  response 
times  less  than  3  seconds,  and  computer  service  B 
has  90%  of  its  response  times  less  than  3.5 
seconds,  then  rank  A  as  being  better  than  B," 

where  "901"  is  prespecified  by  the  analyst.  In  a 
comparison  based  on  proportions,  a  value  of  a 
comparison  criterion  is  prespecified.  Results 
are  produced  of  the  form: 

"if  computer  service  A  has  80%  of  its  response 
times  less  than  3  seconds,  and  computer  service  B 
has  87%  of  its  response  times  less  than  3  seconds. 
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then  rank  B  as  being  better  than  A," 

where  "3  seconds"  is  prespecified  by  the  analyst. 

Selection  of  Systems  Better  than  a  Standard 

In  the  case  of  selecting  those  computer  systems 
which  are  better  than  a  standard  (Phase  II  in 
Figure  1),  an  experimental  design  is  required 
which  leads  to  an  analysis  of  the  data  that 
answers  the  question  "Which  services  are  at 
least  as  good  as  a  prespecified  standard?".  The 
ranking  and  selection  techniques  which  have  been 
developed  for  selection  better  than  a  standard 
are  not  appropriate  for  computer  selection  when 
performance  indices  are  being  compared  at  their 
mean  or  percentile  values.  The  procedures 
make  assumptions  about  the  data  that  are  clearly 
not  justified  in  computer  selection  experiments 
(such  as  equal  variance  for  all  systems).  But, 
an  appropriate  procedure  does  exist  when  pro- 
Jortions  are  the  basis  for  a  selection  (see 
[ MAM78  J ) . 

Selection  of  the  Best  System 

In  the  case  of  selecting  the  best  computer 
service  (Phase  III  in  Figure  1),  an  experimental 
design  is  required  which  leads  to  an  analysis  of 
the  data  that  answers  the  question  "Which  system 
is  the  best  one?".  In  this  case  ranking  and 
selection  techniques  exist  for  choosing  systems 
based  on  mean  or  percentile  values  (see  [AMF.78b| 
and  [ mam 77]),  and  also  based  on  proportions  (see 
|AMF78a]  and  [MAM78 ] ) . 

A  Feasibility  Study 

A  large  scale  feasibility  case  study  is  underway 
to  evaluate  the  time  and  cost  required  to  apply  • 
the  computer  service  comparison  methodology  in  an 
actual  procurement  environment.  Four  heterogeneous, 
remote-access,  time  sharing  services  are  being 
compared.  The  specifications  for  the  case  study 
and  the  experimental  results  are  presented  in 
[MAM78] . 
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This  paper  describes  a  graph-theoretic 
software  complexity  measure,  and  describes 
how  it  can  be  applied  to  limit  the  logic  in 
a  module  during  the  design  stage  so  it  is 
testable  and  maintainable  during  later 
stages . 


The  process  of  software  construction 
relies  in  a  fundamental  sense  on  the  siz¬ 
ing  of  subsystems  and  modules  that  are 
separately  designed,  coded  and  tested.  One 
concept  very  sorely  needed  is  a  way  to  mea¬ 
sure  and  control  software  complexity  at  che 
design  stage  in  such  a  fashion  that  the 
modules  are  subsequently  manageable,  i.e., 
comprehensible,  and  also  testable.  The 
approach  of  this  research  is  to  develop  a 
concrete,  applicable,  but  yet  mathematically 
based  measurement  of  software  complexity. 
The  measure  presented  in  this  paper  has  a 
mathematical  basis  in  the  sense  that  it  is 
not  dependent  on  the  application  a  module 
is  used  for,  the  language  used,  or  the  pro¬ 
gramming  style — it  measures  purely  the  com¬ 
plexity  of  internal  logic  of  a  module.  The 
measure  presented  is  also  immediately  ap¬ 
plicable  because  it  is  readily  understood 
by  the  programming  community  and  it  direc¬ 
tly  mapps  into  a  rigorous  testing  method¬ 
ology  and  makes  maintenance  feasible. 

The  complexity  measure  decided  upon 
will  limit  the  number  of  independent  paths 
in  a  program  at  the  design  stage  so  the 
testing  will  be  manageable  during  later 
stages.  One  of  the  reasons  for  limiting 
independent  paths  instead  of  limiting  all 
potential  paths  is  the  following  dilemma: 

"A  relatively  simple  program  can  have  an 
arbitrary  high  number  of  paths."  The  ap¬ 
proach  taken  here  is  to  limit  the  number  of 
basic  (or  independent)  paths  that  when  taken 
in  combination  will  generate  all  paths. 

Definition 

One  definition  and  one  theorem  from 
graph  theory  will  be  needed  to  develop 
these  concepts.  See  Berge  (1)  for  a  refe¬ 
rence  . 

Definition  1 

The  cyclomatic  number  v(G)  of  a  graph 


G  within  n  vertices,  e  edges  and  p  con¬ 
nected  components  is  v(G)=e-n+p. 

Theorem  1 

In  a  strongly  connected  graph  G,  the 
cyclomatic  number  is  equal  to  the  maximum 
number  of  linearly  independent  circuits. 

The  application  to  computer  programs 
will  be  made  as  follows.  Given  a  program 
associate  with  it  a  graph  that  has  unique 
entry  and  exit  nodes.  Each  node  in  the 
graph  corresponds  to  a  block  of  statements 
where  the  flow  is  sequential  and  the  arcs 
represent  the  program's  branches  taken 
between  blocks.  This  graph  is  classically 
known  as  the  control  graph  (see  Ledyard 
(12])  and  it  is  assumed  that  each  node  can 
be  reached  by  the  entry  node  and  each  node 
can  reach  the  exit. 

For  e  .ample  the  control  graph  shown 
below  has  seven  blocks  (a)  through  (g) , 
entry  and  exit  nodes  (a)  and  (g) ,  and  ten 
arcs . 


In  order  to  apply  Theorem  1  the  graph 
must  be  strongly  connected  which  means 
that  given  two  nodes  (a)  and  (b)  there 
exists  a  path  from  (a)  to  (b)  and  a  path 
from  (b)  to  (a) .  To  satisfy  this  we  ass- 
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Basis 


ociate  an  additional  edge  with  the  graph 
which  branches  from  the  exit  node  (g)  to 
the  entry  node  (a)  as  shown  below.  123456789  10 


Theorem  1  now  applies  and  it  states 
that  the  maximal  number  of  independent 
circuits  in  G*  is  11-7+1  (G  has  only  one 
connected  component  so  we  set  p  equal  to 
1.  The  generalized  case  where  p> 1  is  used 
for  design  complexity,  (see  McCabe  til)). 
The  implication  therefore  is  that  there  is 
a  basis  set  of  5  independent  paths  that 
when  taken  in  combination  will  generate  all 
paths.  For  example,  the  set  of  5  paths 
shown  below  form  a  basis. 


bl  : 

abeg 

b2 : 

a  (be)  2 

b3 : 

abefg 

b4 : 

adefg 

b5 : 

adf  g 

If  one  chooses  any  arbitrary  path  it 
should  be  equal  to  a  linear  combination  of 
the  basis  paths  bl-b5.  For  example,  the 
path  abcbefg  is  equal  to  b2+b3-bl,  and  path 
a(bc)3g  equals  2*b2-bl.  To  see  this  it  is 
necessary  to  number  the  edges  in  G  and  show 


bl  1001000010 
b2  1012000010 
b3  1000100101 
b4  0100010101 
b5  0100001001 


abcdefg  1011100101 
a (be) g  1023000010 

The  path  abcbefg  is  represented  as  the 
edge  vector  shown  above  and  it  is  equal  to 
b2+b3-bl  where  the  addition  is  done  com¬ 
ponent-wise.  In  similar  fashion  the  path 
a(bc)Jg  shown  above  is  equal  to  2*b2-bl. 

It  is  important  to  notice  that  Theorem 
l  states  that  G  has  a  basis  set  of  size  5 
but  it  does  not  tell  us  which  particular 
set  of  5  paths  to  choose.  For  example, 
the  following  set  will  also  form  a  basis. 

adfh 

abefg 

adefg 

a (be) 3befg 
a  (be)  4g 

When  this  is  applied  to  testing  the 
actual  set  of  5  paths  used  will  be  dictated 
by  the  data  conditions  at  the  various  de¬ 
cisions  in  the  program  but  the  theorem 
guarantees  that  we  will  always  be  able  to 
find  a  set  of  5. 


It  should  be  emphasized  that  the 
process  of  adding  the  extra  edge  to  G 
was  only  to  make  the  graph  strongly  con¬ 
nected  so  Theorem  1  would  apply.  When 
calculating  the  complexity  of  a  program 
or  testing  the  program  the  extra  edge  is 
not  an  issue  but  rather  it  is  reflected 
in  the  expression  used  for  complexity. 

The  complexity  v,  therefore,  is  defined  as 
v=e-n+2p  since  an  extra  edge  is  added  for 
each  component.  In  each  of  the  examples 
discussed  in  this  paper  there  is  only  one 
component  so  the  complexity  expression 
simplifies  to  v=e-n+2. 


Examples 


Several  actual  control  graphs  and 
their  complexity  will  be  presented  in 
order  to  illustrate  these  concepts.  These 
graphs  came  from  FORTRAN  programs  on  a 
PDP-10.  The  programs  were  analyzed  by  an 
automated  system  FLOW  that  recognizes  the 
blocks  and  transitions  in  a  FORTRAN 


I 


The  graphs  above  were  presented  in 
order  of  increasing  complexity  in  order  to 
suggest  the  relationship  between  the  com¬ 
plexity  numbers  and  our  intuitive  notion 
of  the  complexity  of  the  graphs.  One  es¬ 
sential  ingredient  in  any  testing  method¬ 
ology  is  to  limit  the  program  logic  during 
development  in  order  that,  first,  the  pro¬ 
gram  can  be  understood,  and  second,  the 
amount  of  testing  required  to  verify  the 
logic  is  not  overwhelming.  These  tech¬ 
niques  are  being  presented  in  The  Institute 
for  Advanced  Technology's  Structured 
Testing  seminars  and  experience  has  consis 
tently  shown  that  programs  with  high  cyclo- 
matic  complexity  are  difficult  to  understand 
and  are  rarely  tested  adequately.  For 
example,  according  to  its  author  the  pro¬ 
gram  below  is  'one  of  my  better  programs' 
and  it  required  only  'about  four  or  five 
tests  to  verify' . 


4.  J.B.  Goodenough  and  S.L.  Gerhart, 
"Toward  a  Theory  of  Test  Data  Selection", 
Proc,  1975  International  Conf.  on  Reli¬ 
able  Software,  Los  Angeles,  California, 
pp.  493-510. 

5.  W.C.  Hetzel,  "Program  Test  Methods 
Prentice  Hall,  Inc.",  Englewood  Cliffs, 

New  Jersey,  1973. 

6.  W.E.  Howden,  "Symbolic  Testing 
and  the  Dissect  Symbolic  Evaluation  Sys¬ 
tem",  IEEE  Trans,  on  Software  Engineering 
Vol .  SE-3  No.  4,  pp.  266-278,  July  1977. 

7.  J.C.King,  "A  New  Approach  to 
Program  Testing",  Proc.,  1975  Interna¬ 
tional  Conf.  on  Reliable  Software,  Los 
Angeles,  California,  pp.  228-233. 

8.  D.E.Knuth,  "Structured  Program- 

The  physical  size  of  the  program  this  ming  with  GOTO  Statements”,  Computing 

graph  is  derived  from  is  only  70  lines  of  Surveys,  vol.  6,  pp.  261-301,  Dec.  1974. 

source  code.  The  physical  size  of  several 

of  the  twelve  previous  graphs  exceeded  70  9.  R.  Kosaraju,  "Analysis  of 

lines,  and,  in  fact,  only  a  weak  correla-  Structured  Programs",  J.  Comput.  Syst. 

tion  between  physical  size  and  complexity  Scil,  vol.  9,  pp.  232-255,  Dec  1974; 

has  been  found.  Because  of  this,  the  com-  also  Dep.  Elec  Eng.,  The  Johns  Hopkins 

mon  practice  of  attempting  to  limit  com-  Univ. ,  Baltimore,  Md. ,  Tech.  Rep.  72:11, 

plexity  by  only  controlling  how  many  pages  1972. 

a  routine  will  occupy  is  entirely  inade¬ 
quate.  This  complexity  measure  has  been  10.  H.  Legard  and  M.  Marcotty,  "A 

used  in  production  environments  by  limit-  Generalogy  of  Control  Structures” 

ing  the  complexity  of  every  module  to  10.  Commun.  Assoc.  Comput.  Mach.,  Vol’.  18, 

Programmers  have  been  required  to  calcul-  pp.  629-639,  Nov.  1975. 

ate  the  complexity  as  they  develop  rou¬ 
tines,  and  if  it  exceeds  10  they  are  11.  T.J.  McCabe,  "A  Complexity 

required  to  recognize  and  modularize  sub-  Measure",  IEEE  Trans,  on  Software  Engi- 

functions  or  re-design  the  software.  The  neering,  Vol.  SE-2  No.  4,  pp.  308-320, 

only  situation  where  the  limit  of  10  seared  Dec  1976. 

unreasonable  and  an  exception  was  allowed 

is  in  a  large  CASE  statement  where  a  num-  12.  E.F.  Miller,  "Program  Testing: 

ber  of  independent  blocks  followed  a  sel-  Art  Meets  Theory",  Computer,  Vol.  10, 

ection  function.  No.  7,  op.  42-51,  July  1977. 

See  McCabe  [11]  for  further  simpli¬ 
fication  of  this  research.  in  particular, 
reference  [11]  contains  results  that 
allow  a  simplification  of  the  computation 
of  e-n+2p  into  more  intuitive  programming 
terms.  Reference  [11]  also  comments  on 
the  induced  testing  methodology  and  it 
deals  with  the  measurement  of  structured¬ 
ness  of  programming  logic. 
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ABSTRACT 

This  paper  describes  the  utility  of  the  pro¬ 
gressive  application  of  software  quality  metrics 
during  large-scale  software  developments.  The 
concept  of  the  software  quality  metrics  was 
derived  and  validated  during  a  study  supported  by 
the  Air  Force  Systems  Command  Electronic  Systems 
Command  and  Rome  Air  Development  Center.  Fur¬ 
ther  extensions  to  the  concepts  are  currently 
being  supported  by  Rome  Air  Development  Center 
and  the  U.S.  Army  Computer  Systems  Conmand, 
AIRMICS.  The  metrics  provide  a  disciplined, 
engineering  approach  and  life  cycle  management 
viewpoint  to  software  quality  assurance. 


INTRODUCTION 

In  a  study  for  the  Air  Force,  a  program 
management-oriented  view  of  software  quality  was 
established  (Ref  1).  The  concept  of  quality 
derived  was  based  on  three  viewpoints  with  which 
a  program  manager  interacts  with  the  end  product: 
Its  operation,  revision,  and  transition.  These 
viewpoints  correspond  to  the  life  cycle  activities 
of  the  product.  The  factors  of  software  quality 
associated  with  these  three  viewpoints  are  shown 
in  figure  1. 


These  factors  are  used  by  the  program  manager 
to  identify  which  qualities  are  important  to  the 
particular  development  effort.  The  desired  level 
of  quality  can  be  quantitatively  specified  based 
on  the  formal  definitions  established  for  these 
factors. 

Associated  with  each  quality  factor  is  a  set 
of  criteria.  These  criteria  are  attributes  of  the 
software  and  related  documentation  and  their  presence 
provides  the  quality  or  characteristics  implied 
by  the  factor.  Software  metrics  have  been  estab¬ 
lished  which  provide  quantitative  measures  of 
those  attributes  represented  by  the  criteria.  The 
metrics  allow  measurement  of  the  progress  toward 
achieving  desired  levels  of  quality  by  their  appli¬ 
cation  to  intermediate  products  produced  during  a 
large-scale  software  development. 

These  concepts  are  extensions  of  some  signif¬ 
icant  efforts  by  others  in  this  field  (Refs  2,  3, 

4,  5,  6).  We  have  added  a  program  manager's  ori¬ 
entation,  automated  the  collection  of  many  of  the 
metrics,  and  added  quantification  to  measures 
applied  during  the  early  phases  of  a  development. 

The  concept  is  based  on  the  following  facts 
about  quality: 

•  Relative  to  the  application 

•  Impacts  the  cost  to  develop 


MAINTAINABILITY  - 
CAN  I  FIX  IT? 

FLEXIBILITY  - 
CAN  I  CHANGE  IT? 

TESTABILITY  - 
CAN  I  TEST  IT? 


PORTABILITY  -  WILL  I  BE  ABLE  TO  USE  IT 
ON  ANOTHER  MACHINE? 

REUSABILITY  -  WILL  I  BE  ABLE  TO  REUSE 
SOME  OF  THE  SOFTWARE? 

.  .INTEROPERABILITY  -  WILL  I  BE  ABLE  TO 
■"W  INTERFACE  IT  WITH 

% 

\ 

% 


ANOTHER  SYSTEM? 


CORRECTNESS  -  DOES  IT  00  WHAT  I  WANT? 

RELIABILITY  -  DOES  IT  DO  IT  ACCURATELY  ALL  THE  TIME? 
USABILITY  -  CAN  I  RUN  IT? 


EFFECIENCY  -  WILL  IT  RUN  ON  MY  HARDWARE  AS 
WELL  AS  IT  CAN? 


INTEGRITY  -  IS  IT  SECURE? 


Figure  1.  Factors  in  Software  Quality 
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•  Impacts  the  life  cycle  costs 

t  Should  be  customer-defined 

•  Has  an  associated  cost  to  measure  and 
control . 

•  Its  measurement  will  change  as  the  tech¬ 
nologies  of  producing  software  change. 

The  fact  that  there  is  a  trend  toward  more 
structured  disciplined  programming  techniques  and 
environments  reinforces  the  concept  of  software 
qual ity  metrics. 


To  evaluate  the  current  degree  or  level  of  a 
particular  quality  factor,  i,  for  a  module,  j,  the 
particular  column  in  the  measurement  matrix  is 
multiplied  by  the  row  in  the  coefficient  matrix. 
The  resultant  value: 


ci  ,1  mi,j 


m,  ,  +  ci>2 


+  ci,n  Vj  *  ri,j 


is  the  current  predicted  rating  of  that  module,  j, 
for  the  quality  factor,  i. 


The  purpose  of  this  paper  is  to  expand  upon 
the  utility  of  the  periodic  application  (measure¬ 
ment)  of  software  quality  metrics  during  large- 
scale  software  developments. 


The  coefficient  matrix  should  be  relatively  sparce 
(many  c-jj  =  0).  Only  subsets  of  the  entire  set 
of  metrics  applicable  at  any  one  time  relate  to 
the  criteria  of  any  particular  quality  factor. 


A  MEASUREMENT  VEHICLE 


At  appropriate  times  during  a  large-scale 
development,  the  application  of  the  metrics 
results  in  a  matrix  of  measurements.  The  metrics 
that  have  been  established  to  date  are  at  two 
levels  -  system  level  and  module  level.  The 
approach  to  be  described  is  applicable  to  both 
levels  and  will  be  described  in  relationship  to 
the  module  level  metrics. 


A  n  by  k  matrix  of  measurements  results  from 
the  application  of  the  metrics  to  the  existing 
products  of  the  development  (e.g.,  at  design,  the 
products  might  include  review  material,  design 
specifications,  test  plans,  etc.)  where  there  are 
k  modules  and  n  module  level  measurements  appli¬ 
cable  at  this  particular  time. 


"’ll  m12 
m21 


mlk 


"nl 


"nk 


For  that  particular  time  there  is  an  associ¬ 
ated  matrix  of  coefficients  which  represent  the 
results  of  linear  multivariate  regression  analyses 
against  empirical  data  (past  software  develop¬ 
ments);  These  coefficients,  when  multiplied  by 
the  measurement  matrix  results  in  an  evaluation 
(prediption)  of  the  quality  of  the  product  based 
on  the^development  to  date.  This  coefficient 
matrix,  shown  below,  has  n  columns  for  the  coeffi¬ 
cients  of  the  various  metrics  and  11  rows  for  the 
11  quality  factors. 

f  C11  C1 2  •  •  cln  1 


L  Cll,l  cll,n 


Multiplying  the  complete  measurement  matrix 
by  the  coefficient  matrix  results  in  a  ratings 
matrix.  This  matrix  contains  the  current  pre¬ 
dicted  ratings  of  each  module  for  each  quality 
factor. 


r 


l,k 


rll  ,k 


This  approach  represents  the  most  formal 
approach  to  evaluating  the  quality  of  a  product 
utilizing  the  software  quality  metrics.  Because 
the  coefficient  matrix  has  been  developed  only 
for  a  limited  sample  in  a  particular  environment, 
it  is  neither  generally  applicable  nor  has  sta¬ 
tistical  confidence  in  its  values  been  achieved. 


Other  valuable  information  is  available  from 
the  measurement  matrix  with  the  current  state  of 
the  technique. 

For  example,  the  development  of  a  particular 
module  can  be  assessed  by  examining  the  measures 
in  the  appropriate  column  of  the  measurement  ma¬ 
trix.  How  all  the  modules  are  doing  with  respect 
to  any  particular  attribute  can  be  assessed  by 
examining  the  appropriate  row.  Those  particular 
modules  with  excessively  low  scores  should  be 
investigated  further.  This  form  of  sensitivity 
analysis  is  facilitated  by  the  collection  of  the 
metric  data  in  the  form  of  the  measurement  matrix. 

In  examining  a  particular  measure  across  all 
modules,  consistently  low  scores  may  exist.  This 
situation  identifies  the  need  for  a  new  standard 
or  stricter  enforcement  of  existing  standards  to 
improve  the  overall  development  effort. 

As  experience  is  gained  with  the  metrics  and 
data  is  accumulated,  threshold  values  or  industry 
acceptable  limits  may  be  established.  In  addition, 
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comparisons  of  metric  values  with  trends  in  the 
occurrence  of  Design  Problem  Reports  and  Software 
Problem  Reports  may  prove  to  provide  important 
quality  assurance  insight. 

QUALITY  CONTROL  MECHANISM 

The  periodic  application  of  the  metrics  dur¬ 
ing  a  large-scale  software  development  can  be 
viewed  as  a  control  system.  Snapshot  assessments 
are  generated,  feedback  to  program  management  is 
provided  with  respect  to  their  specified  require¬ 
ments  for  quality,  thereby  allowing  corrective 
action,  calibration,  redirection,  or  the  identifi¬ 
cation  of  areas  to  be  emphasized  later  in  the 
development  (e.g.,  testing)  to  be  enacted. 

These  concepts  are  illustrated  in  figure  2. 

It  is  important  to  note,  the  metrics  were  estab¬ 
lished  with  a  goal  of  not  requiring  additional 
products  to  be  generated  during  the  development 
efforts.  Thus,  the  metrics  take  advantage  of 
current  control  mechanisms  (delivered  products 
and  reviews)  normally  utilized  in  a  large  soft¬ 
ware  development. 

EXPERIENCES 

These  concepts  have  been  applied  to  software 
developed  in  a  command  and  control  system 
environment.  This  application  resulted  in  the 
current  coefficient  matrices  and  sets  of  metrics 
to  be  applied  during  a  software  development 
(Ref  7,  8). 


We  are  currently  assessing  the  value  of  these 
concepts  to  the  development  and  modification  of 
support  software.  Also,  we  have  just  begun  an 
effort  with  Air  Force  and  Army  sponsorship  (Ref  9) 
to  evaluate  the  metrics  with  respect  to  an  Army 
management  information  systems  environment.  The 
experience  gained  from  these  efforts  will  provide 
an  excellent  basis  for  discerning  the  extent  to 
which  these  concepts  can  provide  additional  quality 
assurance  discipline  in  large-scale  software 
developments. 

In  effect,  these  concepts  are  an  attempt  to 
provide  a  mechanism  for  the  specification  of  life 
cycle-related  quality  goals  and  assessment  of  the 
progress  toward  those  goals  during  the  early 
development  phases.  The  quantification  provided 
by  the  metrics  affords  more  consistent  evaluation 
of  the  software  quality.  The  overall  goal  in 
these  efforts  is  to  introduce  a  more  disciplined, 
engineering  approach  and  a  life  cycle  management 
viewpoint  to  software  quality  assurance. 
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Effective  software  reliability  evalu¬ 
ation  requires  theories  of  software  relia¬ 
bility  which  define  and  deal  with  software 
reliability  quantitatively,  technologies 
for  reliability  data  measurement  and  data 
analysis,  techniques  to  estimate  or  predict 
software  reliability,  and  practical  relia¬ 
bility  evaluation  methodologies  which 
effectively  reflect  characteristics  of 
software  nature. 

This  paper  assesses  the  extents  to 
which  these  requirements  are  currently  met, 
and  introduces  some  approaches  toward  an 
effective  software  reliability  evaluation. 
Introduced  are  the  methodologies  for  soft¬ 
ware  reliability  evaluation  and  the  soft¬ 
ware  reliability  management-aid  tools. 

1.  INTRODUCTION 

Since  a  software  reliability  became 
the  most  serious  and  important  problem  in 
software  industries,  we  have  already  spent 
more  than  15  years.  During  this  period, 
many  people  tried  to  overcome  this  problem 
vigorously  and  various  kinds  of  approaches 
were  taken  from  all  angles.  Now,  it  is 
broadly  recognized  that  a  software  reliab¬ 
ility  problem  is  clearly  an  integral  port¬ 
ion  of  software  engineering.  However,  as 
known  well,  inspite  of  all  our  efforts, 
the  software  reliability  problem  still 
remains  as  the  knottiest  subject.  This  may 
call  for  next  severe  question.  What  have 
we  done  for  a  long  time,  how  on  earth? 
Clearly,  some  portion  of  the  reliability 
problem  might  be  due  to  evaluation  techni¬ 
ques  for  software  reliability. 

Practically,  an  effective  software 
reliability  evaluation  requires  theoretical 
basis.  Theories  of  reliability  should 
define  software  reliability  and  its  metric 
quantitatively.  Also  needed  are  measurement 
and  analysis  techniques  for  reliability 
data  represented  in  terms  of  the  metric, 
techniques  to  estimate  or  predict  future 
software  reliability  growth  in  testing 
stages,  practical  evaluation  methodologies 
reflecting  software  characteristics  effec¬ 
tively,  and  evaluation-aid  tools. 

From  these  points  of  view,  this  paper 
assesses  previous  works  in  software  relia¬ 
bility  evaluation  briefly,  extracts  major 
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problems,  and  tries  to  give  them  some  solu¬ 
tions  or  improvements.  Proposed  ideas  are 
integrated  as  a  reliability  management-aid 
system  for  an  entire  software  life  cycle. 

Before  we  start  a  discussion,  we  must 
note  some  basic  terms  in  order  to  avoid 
confusion.  Software  error  is  a  defect  that 
causes  a  software  failure.  Software  failure 
is  an  unacceptable  departure  of  software 
operation  from  software  requirements. 
Software  reliability  is  defined  as  a  proba¬ 
bility  of  failure-free  operation  in  a 
specified  environment  for  a  specified  time. 
(This  definition  will  be  changed  later.) 
Reliability  growth  is  defined  as  the  incre¬ 
asing  probability  of  software  to  perform 
required  functions  under  stated  conditions 
during  stated  time  interval.  Software 
reliability  model  mainly  refers  to  a  mathe¬ 
matical  model  constructed  for  a  purpose  of 
assessing  the  reliability  of  software  from 
specified  parameters  which  are  either 
assumed  known  or  are  measured  from  observa¬ 
tions  or  experiments  on  software. 

2.  SOME  BASES  FOR  SOFTWARE  RELIABILITY 

Firstly,  let  us  review  previous  works 
in  software  reliability  evaluation. 

We  can  easily  find  out  two  major  dire¬ 
ctions  in  studies  done  by  predecessors. 
Namely,  they  are  1)  a  direction  to  study 
reliability  characteristics  of  software 
itself,  and  2)  a  direction  to  study 
software  reliability  evaluation  technique. 
Let  us  look  into  them  in  turn. 

2 . 1  Nature  of  Software  Reliability 

The  first  direction  of  studies  aims  to 
find  out  fundamental  elements  of  software 
which  affect  software  reliability,  and  to 
make  clear  causal  relationships  between 
these  reliability  elements  and  software 
development  technologies.  In  other  words, 
this  is  to  analyze  the  elements  of  software 
quality  and  data  on  when,  where,  how,  and 
why  people  make  software  errors.  Also 
intended  as  a  final  object  of  this  directi¬ 
on  is  to  establish  a  set  of  effective  and 
reliable  software  development  techniques 
and  methodologies. 


THIS  PAPER  IS  PARTIALLY  PUBLISHED  AT  THE  THIRD  INTERNATIONAL  CONFERENCE  ON  SOFTWARE 
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( 1 )  Elements  of  software  reliability 

There  are  several  works  in  analysis 
of  software  reliability  elements,  and  most 
of  them  are  by-products  of  general  study 
for  software  quality  which  aims  to  evaluate 
software  quality  qualitatively  or  quantita¬ 
tively.  Organized  software  quality  evalua¬ 
tion  was  firstly  done  by  Rubey  and  Hartwick 
[11 .  Brown  and  Lipow  [2]  formulated  some 
number  of  quality  metrics,  and  Wulf  [3] 
identified  and  provided  concise  definitions 
on  seven  software  quality  attributes. 
Abernathy,  et  al,  [4]  defined  a  number  of 
characteristics  of  operating  systems  and 
analyzed  some  tradeoffs  between  them. 
Recently,  a  framework  for  software  quality 
characteristics  was  established  by  Boehm, 
Brown  and  Lipow  [5] .  They  clearly  identi¬ 
fied  the  software  reliability  elements,  in 
addition  to  a  maintainability  and  a  porta¬ 
bility:  namely,  self-containedness,  accur¬ 
acy,  completeness,  robustness/integrity, 
and  consistency.  A  large  number  of  software 
quality-evaluation  metrics  were  defined, 
classified,  and  evaluated  with  respect  to 
their  potential  benefits,  quantifiability , 
and  ease  of  automation. 

However,  for  an  effective  and  quanti¬ 
tative  evaluation,  we  need  more  amount  of 
further  work  in  this  field. 

(2)  Software  reliability  versus  Hardware 

reliability 

As  a  typical  study  of  software  reli¬ 
ability  nature,  there  are  some  analyses 
discriminating  natures  between  hardware 
reliability  and  software  reliability  [6,7, 
8],  some  of  fundamantal  differences  bet¬ 
ween  the  natures  of  software  and  hardware 
that  affect  reliability  evaluation  are: 

al)  Software  systems  never  run  and  softwa¬ 
re  errors  cannot  be  met  without  any 
input . 

a2)  Software  is  the  transformation  of 

designer's  idea  into  a  symbolic  langu¬ 
age  for  computer  processing  and  soft¬ 
ware  reliability  is  only  dependent  on 
correct  design  and  the  expression  of 
this  design.  Hardware,  in  addition  to 
having  the  reliability  problems  in 
correct  design  and  expression  of  the 
design,  is  physical  in  nature  and 
subject  to  component  failure  patterns 
that  are  statistically  measurable. 
a3)  Software  components  do  not  degrade 

with  time  as  a  result  of  environmental 
stress  or  fatigue  effect  (i.e.  wear- 
out)  . 

a4)  No  imperfections  or  variations  are 

introduced  in  making  additional  copies 
of  a  piece  of  software  {except  possi¬ 
bly  for  a  class  of  easy-to-check 
copying  errors) . 

a5)  A  correction  of  software  fault  alters 

configuration  of  software  and  eliminates 
any  possibilities  of  its  reoccurrence. 


Repairs  of  a  software  configuration 
tend  to  alter  the  configuration,  unlike 
most  hardware  component  replacement 
repairs . 

a6)  A  comprehensive  failure  mode  and 

effect  analysis  is  impractical  for 
large  software,  because  of  the  large 
number  of  distinct  logic  paths. 
a7)  The  information  provided  by  detected 
errors  has  not  yet  been  accurately 
characterized,  limiting  its  usefulness 
in  predicting  the  remaining  number  of 
errors. 

a8)  There  is  no  standardized  approach  for 
exhaustively  testing  software  in  order 
to  assure  that  it  meets  all  operational 
requirements. 

These  affect  severely  to  the  software 
reliability  evaluation  techniques,  especia¬ 
lly  to  the  modelling,  and  lead  us  to  be 
careful  in  applying  hardware  reliability 
theory  to  software. 

(3 )  Software  error  data  collection  and 
analysfs 

A  main  portion  of  the  software  relia¬ 
bility  nature  analysis  might  be  a  study  for 
software  error  data  collection  and  statis¬ 
tical  analysis.  Namely,  that  is  to  collect 
and  analyze  information  on  when,  where,  how 
and  why  people  make  software  errors. 

For  several  years  past,  many  people 
have  made  much  efforts  to  collect  software 
error  data  during  software  development 
and  maintenance  processes.  We  find  some 
published  reports  on  software  error  data 
collections.  Such  data  are,  of  course, 
necessary  to  evaluate  a  reliability  of  a 
software  product  itself,  and  also  useful  to 
derive  and  validate  software  reliability 
models.  However,  strictly  speaking,  very 
few  data  have  been  collected  firmly  through 
an  entire  software  development  and  mainte¬ 
nance  phases.  Let  us  look  all  around  and 
give  some  example  studies  on  software  error 
data . 

Endres  [9]  collected  error  data  disco¬ 
vered  in  testing  stage  of  operating  systems 
development,  and  analyzed  them  in  terms  of 
a  distribution  of  error  occurrence  rate  and 
frequency  of  software  modification  for  each 
module.  The  report  contains  classified 
error  statistics  by  error  types  and  causes. 
Shooman  and  Bolsky  [10]  introduced  statis¬ 
tical  data  on  frequencies  of  error  occurr¬ 
ence  rate  for  each  error  type,  computer 
time  and  working  time  used  to  test,  compu¬ 
ter  time  and  working  time  to  correct  errors, 
changes  for  software,  and  some  other  with 
trouble  and  correction  report  forms. 

Thayer,  Lipow  and  Nelson  [11,12)  analyzed 
error  data  collected  by  some  projects  in 
TRW  precisely.  Besides  these,  there  are 
many  interesting  works  that  report  various 
kinds  of  error  data  from  various  points  of 
view:  studies  done  by  Boehm,  et  al  [13], 

Bell  and  Thayer  [14),  Fagan  [15],  Akiyama 
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116],  Gannon  and  Horning  (17],  McGeachie [18] , 
Musa  (19],  Litecky  and  Davis  (201,  Baker 
(21),  Miyamoto  (22],  and  Weinberg  [23],  for 
instance.  However,  most  of  the  previous 
studies  collected  and  analyzed  only  errors 
discovered  within  testing  stages.  Few  exce¬ 
ptions  are  an  analysis  of  software  require¬ 
ments  errors  by  Bell  and  Thayer  (14],  and  a 
design  error  analysis  by  Fagan  [15]. 

Software  error  information  should  be  collec¬ 
ted  and  analyzed  through  entire  software 
validations  and  operations  of  whole  software 
life  cycle. 

Most  of  the  previous  works  collected 
error  information  manually  by  using  document 
forms.  See  references  [10], [11]  and  [15] 
for  instance.  There  is  no  systematic 
approach  to  collect  error  information  auto¬ 
matically  by  using  an  automated  data  collec¬ 
tion  technique.  Though  SIMON  [24,25]  and 
Software  Factory  [26]  are  aiming  to  automate 
a  collection  of  project  management  data, 
they  only  can  collect  software  error  data 
to  be  detected  in  testing,  not  for  whole 
life  cycle  of  software. 

When  we  analyze  software  error  data  in 
testing  and  operational  stages,  we  have  to 
take  into  account  following  thing.  Since 
software  systems  never  run  and  software 
failures  cannot  be  detected  without  any 
inputs,  we  can  not  find  new  errors  without 
changing  a  test  domain  or  a  user  domain[22]. 
Then,  if  we  collected  information  on  change 
of  test  domain  or  user  domain  in  addition 
to  error  information  at  same  time,  we  would 
be  able  to  analyze  more  detailed  insights 
on  software  reliability.  This  point  of 
consideration  is  often  omitted  in  most  of 
previous  works. 

On  the  other  hand,  it  is  extremely 
difficult  to  compare  error  data  from  diffe¬ 
rent  sources.  It  is  a  big  problem  that  the 
results  of  previous  error  data  collection 
and  analysis  efforts  are  not  compatible  for 
each  other.  This  might  be  mainly  caused  by 
a  lack  of  standards  and  unified  approaches 
on  definitions  of  terminologies  used  in 
projects,  error  categorization,  data 
collection  procedures,  data  analysis,  soft¬ 
ware  error  database,  and  etc.  Though  it  is 
reported  that  some  efforts  are  now  under 
way  at  USAF  Rome  Air  Development  Center, 
and  within  IEEE  Technical  Committee  on 
Software  Engineering  [7],  formal  outputs 
are  not  available  by  now.  Because  it  requi¬ 
res  much  time  and  money  to  build  a  common 
error  database,  we  have  to  collect  and 
analyze  error  data  in  a  compatible  fashion 
as  far  as  we  can. 

(4)  Problems  in  error  data  collection  and 


Based  on  the  review  and  some  references 
[9,10,11],  we  can  summarize  major  problems 
in  error  data  collection  and  analysis  as 
follows : 

bl)  Software  development  and  maintenance 


projects,  the  software,  and  the  relia¬ 
bility  data  vary  considerably  and  not 
describable  in  common  terminology.  It 
is  extremely  difficult  to  compare 
error  data  from  different  sources. 
b2)  Some  projects  produce  data  that  are 
classified  as  they  like. 
b3)  Analysis  techniques  and  questions  to 
ask  of  the  data  are  not  well  known. 
b4)  Data  accuracy  is  a  chronic  question. 
b5)  Analysis  is  often  incomplete  or 

inaccurate  if  proper  communication  with 
project  performers  is  not  established. 
b6)  Project  organizational  structure  and 
resources  vary,  making  consistent, 
multi-project  data  collection  questi¬ 
onable. 

b7)  Definition  of  which  parameters  are 

needed  and  meaningful  to  collect  is  in 
its  infancy. 

b8)  Analysis  of  relationships  between 
error  information  and  the  system 
operations  (i.e.  test  domain,  user 
domain)  is  often  left  unnoticed. 
b9)  Definitions  of  measures  to  represent 
program  complexity  and  scale  are 
needed.  Analysis  of  correlations  bet¬ 
ween  program  complexity  or  scale,  and 
the  error  information  (e.g.  frequency) 
is  needed. 

blO)  Manual  data  collection  is  a  lot  of 
work. 

bll)  Certain  data  items  are  perishable  and 
must  be  collected  and  analyzed  timely 
when  they  become  available,  not  after 
the  fact. 

bl 2 )  There  is  no  guarantee  that  data  will 
be  collected  (i.e.  no  requirement  for 
projects  to  collect  data). 
bl3)  The  fervor  of  data  collection  inspires 
data  gathering  that  is  non-supportive 
of  software  development  processes. 
bl4)  Presently  implemented  data  collection 
schemes  often  fail  to  gather  data  in 
sufficient  detail,  making  results  of 
analysis  questionable. 
bl5)  Software  error  data  collection  is 

commonly  done  for  testing  stages,  not 
for  entire  software  life  cycle. 
bl6)  Software  reliability  data  collection 
can  represent  cost,  schedule,  and 
manpower  impediments  to  software 
development  projects.  The  impact  or 
cost  considerations  of  data  collection, 
although  real,  are  not  fully  appreci¬ 
ated  . 

bl7)  Performers,  project  management,  and 
even  the  customers  of  software  are 
sensitive  about  providing  data  that 
might  be  used  to  adversely  evaluate 
project  by  external  agencies. 
bl8)  Contractor  and  customer  representative 
of  project  management  are  not  aware  of 
the  benefits  of  data  nalysis  and 
therefore  tend  not  to  support  it. 
bl9)  Project  structure  is  generally  not 

tailored  to  use  available  data  (i.e. 
the  mechanism  for  analyzing  data  and  . 
folding  results  back  into  the  project 


is  not  provided) . 

b20)  Some  data  elements  require  protection 
to  preserve  the  privacy  of  the 
contributor  (e.g.  cost  data). 
b21)  Data  collection  is  commonly  thought 
to  be  "not  necessary"  to  a  properly 
managed  project. 

Some  of  these  problems  are  due  to  the  policy 
of  software  project  management. 

2 . 2  Software  Reliability  Evaluation 

The  second  direction  of  studies  is  to 
study  a  phenomenological  aspect  of  software 
reliability.  This  is  to  study  how  often  a 
software  is  deficient  while  it  operates,  in 
other  words.  Also,  this  direction  aims  at 
gaining  high  quality  of  software  as  a  final 
object.  However,  a  greater  emphasis  is 
placed  on  to  establish  an  effective  software 
reliability  evaluation  technique  itself. 

( 1 )  Metrics  for  software  reliability 

At  a  standpoint  of  phenomenological 
aspect,  a  software  reliability  and  metrics 
to  represent  a  degree  of  software  reliabi¬ 
lity  are  often  defined  probabilistically  as 
defined  before.  Metrics  are  defined  some¬ 
times  time-dependently  and  sometimes 
time- independently .  Most  metrics  were 
developed  originally  for  hardware  reliabi¬ 
lity:  for  instance,  availability  Alt), 
reliability  function  R(t),  mean  time  to 
failure  MTTF,  mean  time  between  failures 
MTBF ,  and  mean  time  to  repair  MTTR.  In 
addition  to  these,  there  are  some  other 
metrics,  such  like  mean  time  between  soft¬ 
ware  errors  MTBSE  [22],  some  rates  to 
represent  reliabilities  of  input  domain, 
test  domain,  and  user  domain  [22,27], 
However,  these  work  at  times,  but  often  are 
unable  to  explain  actual  experienced  soft¬ 
ware  reliability  phenomena.  This  is  prima¬ 
rily  because  of  fundamental  differences 
between  software  phenomenology  and  the 
assumptions  of  hardware  reliability 
theories,  and  secondly  because  of  a  lack  of 
practical  methodologies  for  application  to 
software.  There  is,  therefore,  no  single 
metric  which  reflects  the  software  natures 
and  can  give  a  universally  useful  rating 
of  software  reliability.  This  might  be 
thought  as  one  of  reasons  for  a  traditional 
criticism  insisting  that  it  is  impossible 
to  apply  hardware  reliability  theories  to 
software  reliability  problem. 

( 2 )  Software  reliability  models 

Now  let  us  review  a  software  reliabi¬ 
lity  modelling  which  is  the  most  theoretic¬ 
al  portion  of  software  reliability  evalua¬ 
tion  techniques. 

Shooman  [28,29],  Jelinski  and  Moranda 
[30,31]  proposed  similar  type  of  probabili¬ 
stic  models  for  a  removal  rate  of  software 
errors  during  test.  For  these  earliest 


software  reliability  models,  they  assumed 
an  error  detection  rate  was  proportional 
to  a  number  of  remaining  errors.  Shooman 
related  an  error  detection  rate  also  to  a 
program  size  and  an  instruction  processing 
rate.  He  investigated  some  other  types  of 
models  for  error  correction,  and  proposed 
a  two-point  parameter  estimation  approach. 
Schneidewind  [32]  suggested  an  empirical 
reliability  prediction  model  by  fitting 
failure  intervals  with  an  appropriate 
reliability  function.  He  applied  a  maximum 
likelihood  estimation  to  determine  parame¬ 
ters  of  error  detection  and  correction 
processes.  Littlewood  and  Verrall  [33] 
proposed  a  Bayesian  reliability  growth  model 
by  assuming  that  error  corrections  make 
smaller  a  failure  rate  in  a  probabilistic 
rather  than  deterministic  fashion.  Trivedi 
and  Shooman  [34]  developed  a  many-state 
Markov  model  for  an  estimation  of  reliabi¬ 
lity  function  R(t)  and  availability  A(t) 
during  testing  stages.  They  assumed  that 
error  and  correction  occur  alternately  and 
sequentially,  only  one  type  of  error  occurs, 
and  error  discovery  rate  is  constant,  in 
addition  to  basic  Markov  assumptions. 

Musa  [6]  developed  a  set  of  execution  time 
model  and  calendar  time  model.  He  assumes 
that  tests  are  representative  of  the  envir¬ 
onment  in  which  a  program  will  be  used  and 
are  continuously  global,  failures  are 
independent  of  each  other  and  distributed 
at  any  time  with  a  constant  average  execu¬ 
tion  time  occurrence  rate  that  is  proport¬ 
ional  to  a  number  of  errors  remaining,  all 
failures  are  observed,  and  an  error  correc¬ 
tion  rate  is  proportional  to  a  failure 
detection  rate  at  all  time,  for  the 
execution  time  model.  Also,  in  order  to 
relate  a  calendar  time  with  testing  activi¬ 
ties,  he  made  some  assumptions  for  utiliza¬ 
tion  of  resources  (i.e.  failure  identifica¬ 
tion  personnel,  failure  correction  person¬ 
nel,  computer  time) .  The  model  parameters 
are  calculated  by  using  a  maximum  likelihood 
estimation.  The  effectiveness  of  models  has 
oeen  validated  at  several  real  projects  [35]. 
This  is  one  of  few  models  which  have  a 
practical  methodology  and  a  tool  for 
application.  In  addition  to  these  models, 
there  were  some  other  models  developed  by 
Schick  and  Wolverton  [36],  Weiss,  Corcoran, 
and  Nelson  [37],  These  are  well  reviewed 
by  [11]  and  [48].  Also,  Littlewood  [38,39], 
and  Shooman  [40,41]  have  proposed  some 
revised  versions  of  their  reliability 
models  vigorously.  The  models  proposed  by 
Littlewood  are  based  on  Markov  processes. 
Shooman  has  developed  structural  model  in 
which  logical  paths  of  program  structure, 
execution  time  and  failure  rate  of  each 
path  are  related  with. 

Besides,  there  are  some  other  simple 
models  to  estimate  a  number  of  initial 
latent  errors  in  programs.  Such  models  do 
not  assume  much,  and  some  models  are  based 
on  the  statistics  of  historical  data  [43]. 

For  instance,  Mills  [42]  developed  an  error 
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seeding  model.  These  models  are  useful  some¬ 
times. 

(3)  Problems  in  reliability  models 

During  the  review,  we  found  many  prob¬ 
lems  in  software  reliability  models  for 
practical  applications.  Most  of  them  are 
derived  from  assumptions  applied  in  the  mod¬ 
elling  [8J.  Typical  worrisome  (?)  assumptions 
are  summarized  as  follows: 

cl)  The  errors  remaining  in  program  de¬ 
crease  monotonically . 

c2)  The  discovery  rate  is  proportional  to 
the  number  of  remaining  errors. 
c3)  New  errors  are  not  introduced  during 
debugging. 

c4)  The  software  failure  rate  (and  associ¬ 
ated  hazard  rate)  is  constant,  or 
increases,  or  decreases  during  failure 
intervals. 

c5)  The  hazard  rate  increases  or  decreases 
at  each  time  when  error  is  detected. 
c6)  Software  errors  have  the  same  likeli¬ 
hood  of  detection. 
c7)  All  failures  are  observed. 
c8)  Data  inputs  to  software  system  are 
randomly  selected. 

c9)  Software  failures  are  independent. 
clO)  Software  errors  have  the  same  effects 
on  system  operations, 
ell)  Test  and  users  operations  cover  the 
entire  input  domain  of  software  and 
do  not  change. 

cl2)  Error  correction  rate  is  proportional 
to  failure  detection  rate. 

Some  of  the  reliability  estimation  problems 
are  originated  from  assumptions  of  failure 
distribution,  and  some  others  are  originated 
from  assumptions  of  software  operational 
profiles  in  testing  stage  and  operational 
stage  at  users.  In  most  models,  it  is  as¬ 
sumed  that  a  test  domain  and  a  user  domain 
are  fixed  and  do  not  change.  We  should  note 
that  a  software  reliability  is  a  function  of, 
not  only  a  number  of  remaining  errors,  but 
also  their  effects  on  system  operation, 
software  operational  profiles  (which  are 
changeable  according  to  the  test  space 
growth  and  the  usage  patterns  of  software 
functions) ,  and  locations  of  remaining 
errors.  In  other  words,  the  operational 
reliability  is  a  probability  that  users  don't 
enter  specific  inputs  which  inevitably 
encounter  the  latent  errors. 

Surprisingly  very  few  practical  model 
application  methodologies  for  the  reliabil¬ 
ity  test  data  measurement  (which  should  be 
separated  from  testing  and  debugging)  and 
compilation  are  prepared  in  the  existing 
software  reliability  models. 

In  a  theoretical  sense,  there  is  no  single 
model  which  can  give  a  universally  useful 
estimation  of  software  reliability.  And 
still,  we  are  a  long  off  from  having  truely 
reliable  software  reliability  estimation 
models. 


As  a  whole,  we  can  conclude  that  we 
have  very  poor  software  reliability  evalua¬ 
tion  techniques  and  error  database,  for 
the  present. 

3.  SOFTWARE  RELIABILITY  MANAGEMENT  SYSTEM 

After  considerable  study  of  previous 
works  and  experiences  of  actual  projects, 
we  have  decided  to  improve  the  effectiveness 
of  current  software  reliability  evaluation 
by  developing  the  application  methodologies 
of  currently  existing  techniques  and  models, 
and  supporting  tools.  (We  have  not  make 
plans  to  develop  new  models  and  techniques, 
although  we  have  no  powerful  ones.) 

For  the  problems  of  reliability 
evaluation,  we  shall  deal  with  mainly  by 
using  multiple  reliability  metrics  at  the 
same  time,  preparing  specific  reliability 
data  measurement  method,  relating  with 
operational  profiles,  categorizing  failures 
according  to  their  effects  on  system  opera¬ 
tions,  and  limiting  the  objects  of  reliabi¬ 
lity  estimation.  The  effectiveness  of  error 
data  analysis  can  be  basically  improved  by 
developing  automated  tools  and  methodology 
for  data  collection  and  analysis,  and  by 
integrating  them  with  an  automated  system 
which  supports  entire  software  development 
and  maintenance  processes.  By  using  such  a 
system,  as  we  continue  to  collect  and 
analyze  more  and  more  data  on  how,  when, 
how  often,  where,  and  why  people  make 
software  errors,  and  how  people  detect  and 
correct  software  errors,  we  will  be  able  to 
get  clear  insights  on  how  to  evaluate 
software  reliability,  how  to  develop 
practical  models  for  prediction,  how  to 
avoid  making  software  errors,  and  how  to 
organize  the  validation  strategies. 

Along  this  baseline,  we  are  designing 
an  automated  software  reliability  management 
tool,  and  a  highly  ambitious  software 
development  and  maintenance  support  system. 
This  section  will  introduce  a  brief 
description  for  a  concept  of  this  system. 

3 . 1  Software  Development  and  Maintenance 
Support  System 

In  order  to  support  various  activities 
for  large  scale  software  development  and 
maintenance,  SDMSS  (Software  Development 
and  Maintenance  Support  System)  is  designed 
to  have  several  basic  subsystems  as  shown 
in  Fig.  1.  Requirements  Engineering  Sub¬ 
system  consists  of  a  language  processor  for 
a  requirements  definition  language  RDL,  and 
analysis  tools  which  examine  correctness, 
consistency,  completeness,  and  feasibility 
of  software  requirements  described  in  terms 
of  RDL.  Requirements  descriptions  are  stored 
into  a  development  database  and  maintained 
as  an  abstract  system  model.  Design  Sub¬ 
system  consists  of  a  processor  for  software 
design  language  SDL  and  tools  to  examine 
correctness,  consistency,  completeness  and 
efficiency  of  designed  software. 
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Design  descriptions  are  stored  into  database 
and  maintained  as  a  logical  system  model. 
Programming  Subsystem  consists  of  program¬ 
ming  language  processors  and  source  code 
analysis  tools.  Static  code  analysis  is 
done  by  program  structure  checker,  module 
interface  checker,  events  sequence  checker 
and  diagnostic  functions  of  language  proce¬ 
ssors.  Test  Subsystem  is  made  up  of  test 
description  language  processor,  and  tools 
to  support  test  case  selection,  test  data 
generation,  test  execution,  test  result 
reporting,  maintenance  of  test  database  for 
dynamic  tests.  A  test  description  language 
TDL  is  useful  to  describe  test  drivers  and 
stubs.  Maintenance  Subsystem  supports 
maintenance  activities  originated  from 
software  problem  reports  or  maintenance 
orders.  History  of  software  modifications 
is  reco'ded  in  maintenance  database.  This 
subsystem  internally  uses  functions  of 
Requirements  Engineering  Subsystem,  Design 
Subsystem,  Programming  Subsystem  and  Test 
Subsystem  to  redefine  software  requirements, 
redesign,  recode,  and  retest.  A  tool  to 
find  out  modules,  specifications  and  test 
cases  associated  with  a  requested  correct¬ 
ion  or  modification  is  provided.  In 
addition  to  these,  SDMSS  has  Product  Mana¬ 
gement  Subsystem,  Document  Subsystem, 

Project  Management  Subsystem,  and  Software 
Error  Management  Subsystem  (SEMS) . 

Project  Management  Subsystem  collects  data 
on  cost  and  project  progress,  and  monitors 
software  development  and  maintenance 
activities  for  project  management. 

Except  Requirements  Engineering  Sub¬ 
system,  SDMSS  is  now  in  design  stage. 

3 . 2  Software  Error  Management  Subsystem 

Associating  with  software  validation 
activities  and  maintenance  activities  for 
an  entire  software  life  cycle,  with  a  help 
by  SDMSS,  SEMS  provides  a  set  of  methodo¬ 
logies  and  tools  which  mainly  support  a 
collection  of  software  error  data  produced, 
a  statistical  error  data  analysis,  a  repo¬ 
rting  error  information  to  managers  (e.g. 

QA  manager)  for  swift  feedbacking  or  feed¬ 
forwarding  to  software  activities,  a 
management  of  error  correction  activities, 
and  a  prediction  of  software  reliability 
during  testing  stages.  SEMS  intends  to  give 
solutions  or  improvements  to  some  problems 
in  software  reliability  evaluation  and 
stimulate  the  rest. 

(1)  Components 

SEMS  consists  of  four  major  components. 
Error  Management  Program  manages  and  maint¬ 
ains  error  information  registered  and 
status  information  of  correcting  activities, 
and  controls  interfaces  to  other  subsystems 
in  order  to  collect  error  data  automatica¬ 
lly.  Reliability  Estimation  Program 
calculates  a  future  software  reliability  at 
testing  stages.  This  program  contains  two 


different  reliability  models.  The  Reporting 
Tools  output  statistical  reports  on  catego¬ 
rized  errors,  and  status  reports  or.  error 
correcting  activities  periodically  and  at 
any  time  when  needed.  SEMS  is  extensible  to 
add  new  tools.  Error  Database  is  made  up  of 
events  file,  relation  file,  and  statistic 
file  as  shown  in  Fig.  2.  Events  file 
maintains  error  information  in  chronological 
order  of  occurrence.  Each  error  occurrence 
uses  one  record.  Example  error  record  format 
for  dynamic  testing  stage  is  also  shown  in 
Fig.  2.  An  error  record  contains  four  parts 
the  first  part  keeps,  in  main,  initial 
information  on  a  situation  of  failure  detec¬ 
tion,  the  second  part  maintains  status  inf¬ 
ormation  for  correction  activities,  the 
third  part  keeps  formal  results  of  error 
cause  analysis,  and  the  last  part  is  for 
comments.  Relation  file  maintains  relations 
between  error  records.  Relations  indicate 
sets  of  categorized  errors  in  terms  of,  for 
example,  requirements  errors,  design  errors, 
coding  errors,  or  for  each  version  of  soft¬ 
ware,  module,  and  many  others.  Reporting 
tools  can  make  reports  very  easily  by  refe- 
ring  to  this  file.  Statistic  file  contains 
various  kinds  of  statistics  on  error  data, 
such  as,  accumulated  number  of  error  occur¬ 
rences,  accumulated  number  of  co~rected 
errors,  current  number  of  errors  remained 
uncorrected,  and  reliability  estimation 
results. 

Totally,  SEMS  has  6  basic  types  of 
procedures  to  interact  with  these  components 
They  are  defined  for  a  manual  registration 
of  error  information,  an  automatic  registr¬ 
ation  of  error  information,  a  status  regis¬ 
tration  of  error  correction  activities,  an 
inquiry  for  statistical  error  information 
or  status  information  of  each  error,  an 
authorization  of  error,  and  a  reliability 
estimation. 

( 2 )  Sources  of  error  information 

Since,  in  general,  software  projects 
have  a  potential  for  creating  a  tremendous 
amount  of  various  kinds  of  data,  and  error 
data  are  largely  a  by-product  of  software 
development  and  maintenance  processes,  we 
need  to  collect  meaningful  data  items 
efficiently  and  timely  for  an  entire  soft¬ 
ware  life  cycle.  Fig.  3  illustrates  a 
typical  software  lire  cycle  by  phase  and 
several  types  of  software  error  data.  These 
software  error  data  form  a  continuum  of 
requirements  errors,  design  errors,  and 
coding  errors  starting  as  early  as  the 
software  requirements  definition  phase  and 
extending  into  the  operational  phase. 
Software  validations  as  error  information 
sources  by  SDMSS  are  as  follows.  In  requi¬ 
rements  definition  phase,  some  of  software 
requirements  errors  can  be  detected  by  RDL 
processor  and  associated  static  analyzers 
automatically,  and  some  others  can  be 
detected  by  an  inspection  review  of  require¬ 
ments  specification  manually.  In  design 
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phase,  some  of  software  design  errors  can 
be  detected  by  SDL  processor  and  associated 
tools  automatically,  and  some  others  can  be 
detected  by  an  inspection  review  of  design 
specifications  manually.  Errors  on  software 
efficiency  can  also  be  detected  by  using  a 
simulation  tool.  In  coding  stage,  some  types 
of  software  errors  are  detected  by  diagnos¬ 
tic  functions  of  language  processors  and 
static  source  code  analyzers  automatically, 
and  some  others  are  detected  by  an  inspec¬ 
tion  review  manually.  In  testing  stage,  the 
rests  of  software  errors  should  be  detected 
by  dynamic  tests.  Since  results  of  tests 
described  by  using  TDL  are  self-checkable, 
failures  are  detected  by  automated  tool. 

In  other  cases,  failures  are  detected  by 
manual  analysis  and  reported  in  terms  of 
software  problem  reports.  In  operational 
phase,  users  find  software  failures  and 
report  them  manually  by  using  software 
problem  report.  All  of  these  activities 
generate  error  information. 

There  are  three  types  of  information 
sources.  The  first  one  is  a  static  test 
result  of  specifications  or  source  codes. 
This  kind  of  error  information  can  be  sent 
to  SEMS  automatically  by  static  analysis 
tools.  The  second  type  is  a  dynamic  test 
result.  In  this  case,  Test  Management  Prog¬ 
ram  in  Test  Subsystem  sends  error  informa¬ 
tion  to  SEMS  automatically.  The  third  type 
is  a  result  of  manual  inspections.  This 
kind  of  error  information  must  be  registe¬ 
red  to  SEMS  manually.  Generally,  a  manual 
technique  is  easy  to  implement,  but  it 
requires  manpower  committed  to  a  collection 
task.  Automated  techniques  ease  a  pain  of 
collection  task,  and  can  collect  error  data 
effectively  and  timely  when  it  is  generated. 
However,  automated  techniques  tend  to  be 
less  flexible  in  response  to  changes  in 
software  project  demands  and  may  cause  cost 
problems  of  implementation.  And  yet,  it  is 
often  needed  to  judge  manually  in  order  to 
collect  meaningful  data.  Then,  SEMS  has 
employed  both  of  manual  and  automatic  data 
collection  techniques  so  that  meaningful 
error  data  can  be  collected  effectively. 

(3)  Supporting  functions  for  automated 
data  collection 

Now,  taking  a  case  of  a  test  descrip¬ 
tion  language  TDL,  we  would  like  to  intro¬ 
duce  supporting  functions  for  an  automatic 
data  collection.  Since  a  complete  descrip¬ 
tion  of  TDL  is  beyond  the  scope  of  this 
paper,  we  introduce  only  some  features  of 
TDL  and  TDL  processor  for  a  test  driver 
description.  (TDL  is  partly  based  on  [46].) 

A  TDL  description  for  a  test  driver 
has  a  following  general  form. 

''test  description  > 

''test  definition-  1> 

^test  definition-  2> 


<test  definition-  n^> 

END 

Each  test  defined  in  the  test  procedure 
description  causes  a  partial  execution  of 
target  modules.  Tests  are  performed  one  at 
a  time  in  the  order  in  which  they  appear  in 
test  definitions.  In  the  test  description, 
what  a  set  of  tests  will  examine  is  written 
with  information  on  testing  environment. 

A  complete  form  is  as  follows. 

<test  description >  = 

<test  case  identifications 
<error  category  identifications 
<hardware  conf iguration  identifications 
<software  conf iguration  identifications 
<test  system  identifications 

These  information  specified  in  test  descrip¬ 
tion  are  sent  to  SEMS  at  time  when  a  failure 
is  met.  Each  test  execution  of  target 
modules  is  described  in  a  test  definition. 

A  test  definition  has  four  items  as  shown 
in  next. 

<test  definition-  i>= 

Cinitialization  codes> 

<assertions  > 

<execution  directives 

<error  category  identification  detail> 

Immediately  before  a  test  execution,  the 
variables  specified  in  tue  initialization 
code  statements  are  initialized.  A  form 
for  initialization  is 

<target  module  name/variable, initial  value> 

The  most  test  definitions  contain  assertions 
about  the  performance  of  the  target  modules. 
They  are  specified  in  VERIFY  statements. 

A  form  is 

VERIFY  [at  LI ,L2 ,.. ,Li, .., Ln]  (assertions) 

Li  is  a  label  reference  to  target  modules, 
where  the  assertions  are  to  be  verified, 
and  is  written  in  a  form  [module  name: label]. 
When  no  labels  are  specified,  the  assertions 
are  verified  immediately  after  the  test 
execution  terminates.  An  assertion  can  be 
specified  in  any  way  of  next  three  types: 

(module  name:  logical  expression) 

(PATH-IS:  regular  expression) 

(TIME- IS-LESS-THAN :  value) 

Logical  expressions  are  test  predicates  usi¬ 
ng  variables  and  values.  An  argument  to  a 
PATH-IS  assertion  is  a  regular  expression 
made  up  from  operation  names.  Regular  expr¬ 
essions  permits  use  of  four  operators.  The 
operator  ;  is  a  sequencing  operator,  speci¬ 
fying  an  order  in  which  the  modules  are 
supposed  to  be  executed.  The  operator  +  is 
an  alteration  operator  specifying  either  one 
of  the  modules  can  be  executed.  The  operand 
*  is  used  to  denote  repetition.  The  operand 
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&  can  be  used  to  denote  concurrent  process¬ 
ing  of  modules.  The  assertion  for  TIME-IS- 
LESS-THAN  is  used  to  check  a  execution  time 
of  target  modules.  Every  test  definition  is 
executed  by  an  EXECUTE  statement  of  the 
following  form. 

EXECUTE  [FROM  LI] [TO  L2 , . . . , Li , . . Ln] 

where  Li  is  a  label  reference  to  target 
modules.  This  statement  specifies  the  first 
and  last  statements  to  be  executed.  A  STOP 
or  RETURN  in  target  modules  terminates  only 
the  current  test  execution.  The  error  cate¬ 
gory  identification  is  a  detailed  category 
identification  for  an  error  which  is  expec¬ 
ted  to  be  detected  by  this  test  definition, 
and  should  be  consistent  with  senior  cate¬ 
gory  identification  specified  in  the  part 
of  test  description. 

Internal  procedures  for  an  automatic 
collection  of  error  information  is  roughly 
shown  in  Fig.  4.  Firstly,  immediately 
after  a  test  execution,  a  result  is  checked 
whether  it  agrees  with  a  specified  assert¬ 
ion.  In  the  case  that  it  does  not  agree 
with,  an  event  indicating  a  failure  occurr¬ 
ence  is  informed  to  SEMS,  and  the  informa¬ 
tion  written  in  a  part  of  test  description 
and  error  category  information  are  entered 
into  Error  Data  Queue  by  test  management 
program.  At  this  time,  test  management  pro¬ 
gram  adds  a  calendar  time  when  failure  was 
detected,  name  of  inspector  for  current 
test  execution  (Test  Subsystem  knows  a  name 
of  current  user),  and  error  identification 
number  which  is  serial  through  the  current 
validation  activity.  On  the  other  hand, 

SEMS  takes  out  an  entry  of  error  informat¬ 
ion  from  the  error  data  queue  and  stores 
it  into  error  database  one  by  one.  At  this 
time,  a  serial  number  which  is  counted 
through  whole  software  life  cycle  is  added 
to  the  second  part  of  an  error  identifica¬ 
tion  number.  At  a  time  when  a  failure  event 
is  registered  in  error  database,  since  an 
error  has  not  been  judged  formally  as  a 
true  software  error,  this  is  recognized  as 
a  quasi  error  temporarily.  A  true  cause  and 
a  location  of  the  error  can  not  be  made 
clear  until  all  failure  analyses  complete, 
ordinarily.  Formal  analysis  results  often 
do  not  agree  with  an  initial  judgement  for 
a  quasi  error.  The  formal  error  information 
are  stored  in  the  third  part  of  error  reco¬ 
rd  by  QA  manager  in  manual.  Besides,  as 
correction  activities  progress  step  by  step, 
each  status  is  registered  and  updated  with 
information  on  committed  working  hours, 
computer  time  used,  and  some  others  in  the 
second  part  of  error  record.  These  inform¬ 
ation  are  useful  to  manage  error  correction 
activities. 

3. 3  Software  reliability  estimation 

SEMS  provides  a  software  reliability 
estimation  capability  and  prepares  a  metho¬ 
dology  for  an  estimation  models  application. 


(1)  Software  reliability  models 

Reliability  Estimation  Program  contains 
two  different  software  reliability  models: 
they  are  a  model  developed  by  Musa  [6]  (The 
calculation  modules  for  this  model  in  Relia¬ 
bility  Estimation  Program  is  originally 
developed  by  Musa  [47].)  and  a  simple  model 
developed  by  the  author  [22].  As  reviewed 
before,  Musa's  model  is  based  on  several 
assumptions,  and  the  author's  model  is  also 
based  on  some  basic  assumptions.  These  are 
not  exceptions.  However,  we  should  like  to 
bring  up  basic  questions  here.  (In  this 
section,  we  discuss  reliability  models  from 
a  viewpoint  of  practical  use,  not  a  theoret¬ 
ical  aspect  described  before.) 

Are  these  assumptions  (e.g.  assumptions  c2, 
c4,  c7 ,  c8,  c9,  ell,  and  c 12)  realy  quite 
inappropriate? 

We  are  sure  that  they  are  not  the  ideal 
models.  By  satisfying  some  conditions  in 
some  ways,  can't  we  use  them  for  a  restric¬ 
ted  environment  specifically? 

Answers  might  be  probably  proved  by 
our  experience  and  intuition.  If  we  could 
have  some  specific  methodologies  for  a 
measurement  and  compilation  of  reliability 
test  data,  restrict  objects  of  application, 
and  modify  debugging  and  testing  processes 
in  order  to  satisfy  the  assumptions,  we 
would  be  able  to  decrease  their  bad  effects 
on  the  validity  of  estimation.  As  a  matter  of 
fact,  the  effectiveness  of  Musa's  model  has 
been  validated  by  several  real  projects [ 35] . 
The  author's  model  is  developed  only  for  a 
highly  dedicated  online  real  time  system  and 
has  specific  reliability  test  data  measure¬ 
ment  and  compilation  methods  [22].  We  may 
say  that  we  would  be  able  to  apply  them  for 
practical  use  if  only  we  had  taken  some 
points  into  account.  In  spite  of  this  poss¬ 
ibility,  some  theorists  might  say  that,  in 
any  case,  it  is  still  difficult  to  estimate 
universally  by  a  single  model. 

There  is  a  good  answer  to  this.  In  order  to 
obtain  more  effectiveness,  it  is  better  to 
apply  multiple  models  to  software  reliabili¬ 
ty  estimation  jointly.  This  is  a  main  reason 
why  SEMS  has  two  different  models.  The 
author's  model  is  now  under  way  for  the 
revision.  Revised  model  is  going  to  reflect 
the  changes  in  system  operational  profiles 
during  testing  and  operational  stages. 

(2)  Methodology  for  application 

A  procedure  to  estimate  software  reli¬ 
ability  in  SEMS  environment  is  shown  in  Fig . 
5.  Input  data  to  reliability  estimation 
program  consists  of  a  number  of  parameters 
which  are  grouped  into  5  categories: 
failure,  planned,  debug  environment,  test 
environment,  and  program.  The  failure  data 
consists  of  a  set  of  execution  time  intervals 
between  failures,  along  with  a  number  of  days 
from  the  start  of  testing  on  which  failures 
occurred.  In  a  category  of  planned  data, 
there  are  available  computer  time  (measured 
in  terms  of  prescribed  work  periods) , 
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number  of  available  failure  correction 
personnel,  number  of  available  failure  iden¬ 
tification  personnel,  computer  time  utiliza¬ 
tion  factor,  failure  correction  personnel 
utilization  factor,  and  objective  mean  time 
to  failure.  In  a  category  of  debug  environ¬ 
ment,  there  are  average  computer  time  expen¬ 
ded  per  unit  execution  time,  average  compu¬ 
ter  time  required  per  failure,  average 
failure  identification  work  expended 

per  unit  execution  time,  average  failure 
correction  work  required  per  failure,  and 
the  average  failure  identification  work 
required  per  failure.  A  testing  compression 
factor  is  a  parameter  of  test  environment. 

As  parameters  of  category  of  program,  there 
are  a  number  of  failures  required  to  expose 
and  remove  all  errors  Mo,  and  initial  MTTF 
at  start  of  testing.  The  parameter  Mo  must 
initially  be  estimated  from  a  number  of 
inherent  errors  and  an  error  reduction 
factor  by  using  previous  statistics  [9,11, 
16,22]  or  by  using  simple  models  [43]. 
However,  once  data  are  available  on  inter¬ 
vals  between  failures  during  test  or  opera¬ 
tion,  these  parameters  may  be  reestimated. 
These  data  can  be  provided  by  reliability 
test  data  measurement,  error  database,  and 
project  database.  Ways  to  handle  these 
parameters  are  well  described  in  [35]. 

Reliability  test  data  measurements 
are  different  from  ordinary  testing  activi¬ 
ties  and  are  done  periodically  or  at.  any 
time  needed  in  order  to  collect  data  on 
current  reliability.  (Test  cases  are  gene¬ 
rally  chosen  to  quickly  detect  errors  and 
are  not  representative  of  operational  use.) 
During  reliability  test  data  measurement, 
execution  time  intervals  between  failures 
are  measured  for  each  categorized  class  of 
failures.  Failures  are  categorized  accord¬ 
ing  to  their  effects  on  system  operations. 
Example  categorization  gives  four  classes 
of  failures:  catastrophic  failure  which 
leads  to  system  down,  serious  failure 
which  causes  malfunctions  on  all  current 
users,  moderate  failure  which  causes  mal¬ 
function  on  one  of  current  users,  and 
trivial  failure  which  is  typographical 
error  in  output  messages.  Since  an  error 
correction  may  alter  a  software  configura¬ 
tion  and  decreases  statistical  meaning, 
detected  errors  are  not  corrected  for  the 
currently  used  system  during  current  data 
measurement.  Input  data  in  test  space  are 
selected  through  a  prediction  of  user  space 
initially,  and  are  not  changed  through  the 
measurement.  Test  space  must  be  reflecting 
possible  combinations  of  basic  users  opera¬ 
tional  patterns.  (Recall  that  software 
reliability  is  affected  by  system  operation 
strongly.)  Sample  reliability  test  data 
measurement  methods  are  introduced  in  [22] 
and  ( 28 ] . 

Using  collected  data,  two  models 
estimate  the  software  reliability  growth. 
Estimation  might  be  plotted  as  shown  in 
Fig.  6  with  measured  reliability,  for 
instance.  It  is  strongly  recommended  to 
evaluate  software  reliability  for  each 


categorized  class  of  failures.  A  priority 
of  failure  analysis  and  correction  activity, 
and  an  allotment  of  resources  for  each 
category  of  failures  must  be  determined 
prior  to  the  estimation. 

The  software  systems  as  the  best 
objects  of  reliability  estimation  by  SEMS 
should  have  following  characteristics. 

Namely,  their  input  spaces  must  be  large 
enough,  and  used  randomly  and  independently 
by  many  number  of  users  simultaneously  (for 
random  input) .  A  unit  of  system  services 
(e.g.  an  interaction  or  a  connection)  must 
be  small,  and  system  operational  profiles 
at  users  should  be  predictable  and  station¬ 
ary  at  any  time  while  they  operate.  System 
operations  can  be  classified  into  some 
states  clearly  in  order  to  ease  an  effect 
analysis  of  software  errors  on  system 
operation. 

From  these  points  of  view,  SEMS  recommends 
to  apply  its  estimation  technique  to  online 
real  time  transaction-driven  systems  with 
a  number  of  user  terminals,  in  main,  such 
like  large  bank  systems,  reservation  systems, 
inventory  systems,  information  retrieval 
system,  electronic  switching  system,  and 
etc.  Typical  example  system  is  shown  in 
[22].  In  the  case  of  other  type  of  systems 
(e.g.  batch  processing  systems),  effective¬ 
ness  of  estimation  might  be  decreased  more 
than  above  cases. 

Finally,  we  should  note  that  estimated 
results  do  not  show  an  absolute  certainty 
in  the  future  reliability  growth.  They  are 
only  useful  to  fix  our  aim.  Because  soft¬ 
ware  reliability  growth  rates  are  often 
changed  by  a  project  management  policy, 
ways  of  testing  and  debugging,  individual 
abilities,  positivisms  and  morals  of 
project  performers,  and  by  many  other 
factors  greatly. 

4.  CONCLUDING  REMARKS 

We  have  reviewed  the  software  relia¬ 
bility  evaluation  techniques,  and  described 
some  approaches  to  improve  some  of  the 
major  problems,  briefly.  Finally,  let  us 
assess  the  effectiveness  of  SEMS  on  the 
software  reliability  evaluation,  especially 
on  what  will  be  improved  by  SEMS. 

( 1 )  On  error  data  collection  and  analysis 

Basically,  the  automatic  data  collec¬ 
ting  function  of  SEMS  which  operates  with 
SDMSS  as  one  system  may  improve  the  problems 
of  blO,  bll,  bl2 ,  bl 3 ,  bl4 ,  and  bl5.  For 
the  rest,  we  may  classify  them  into  three 
categories:  1)  the  problems  on  standardi¬ 
zation  or  unification  (i.e.  bl ,  b2,  b3,  b4 , 
b5,  b6 ,  b7 ,  b8,  and  b9) ,  2)  the  problems 
associated  with  the  project  management 
policies  (i.e.  bl7,  bl8,  bl9,  b20,  and 
b21),  and  3)  the  cost  problem  (i.e.  bl6) . 

The  first  kind  of  problems  may  be 
improved  by  piling  up  the  use  experiences 
of  SDMSS.  Since  SDMSS  and  SEMS  provide  the 
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standardized  methodology,  terminology  and 
techniques  for  the  software,  software  deve¬ 
lopment  and  maintenance  project,  and  soft¬ 
ware  reliability  evaluation,  the  pile  of 
collected  and  analyzed  error  information 
would  be  compatible  for  each  information 
source.  In  order  to  solve  this  kind  of 
problems  in  the  broad  sense,  we  may  need 
some  theoretical  works  and  the  joint  efforts 
with  public  organizations.  The  poor  compa¬ 
tibility  of  error  data  is  caused  partially 
by  the  lack  of  the  valid  measures  to 
represent  program  complexity  and  scale. 

The  statistical  error  data  analysis  must  be 
done  in  terms  of  such  measures  independent¬ 
ly  with  the  kind  of  programs  and  programmi¬ 
ng  languages  or  design  languages.  The 
theoretical  work  is  needed  to  get  the  valid 
and  common  measures  for  software.  Currently 
there  are  some  statistical  approaches  to 
this.  For  instance,  Akiyama  [16],  Thayer, 
Lipow  and  Nelson  [11],  Halstead  [44],  and 
Green  [45]  have  reported  some  results  on 
the  correlations  between  error  data  and 
this  kind  of  measures.  In  SEMS,  since  the 
relations  between  error  database  and  the 
development  database  are  maintained,  it  is 
rather  easy  to  analyze  the  correlations 
between  program  complexity  and  scale,  and 
error  data  statistics,  (c.f.  bl,  and  b9) 
Besides,  SEMS  provides  a  set  of  error 
categorizations.  They  may  categorize  errors 
according  to  their  causes,  their  effects  on 
the  system  operations,  their  locations, 
the  phases  in  which  they  are  introduced, 
their  hardness  for  correction,  the  phases 
in  which  they  are  detected,  the  detection 
methods,  and  other  characteristics. 

In  the  environment  where  SEMS  is  used, 
since  the  multiple  categorizations  are  made, 
the  more  amount  of  information  is  provided 
than  the  cases  that  single  categorization 
is  employed,  (c.f.  bl,  b2,  and  b6) 

Most  of  statistical  works  and  reporting 
works  are  done  by  SEMS.  Thus,  users  are  not 
needed  to  do  troublesome  work.  S’SMS  is 
extensible  to  add  new  tools  by  users  in 
order  of  extraction  and  analysis  of  error 
information,  (c.f.  b3) 

The  data  accuracy  may  be  improved  and  bal¬ 
anced  more  than  before,  because  SEMS 
collects  error  data  in  standardized  manner, 
(c.f.  b4 ) 

On  the  other  hand,  since  SEMS  is  jointly 
used  with  SDMSS  in  general,  the  analysis 
works  can  be  done  on  intimate  relations 
with  the  development  and  maintenance  acti¬ 
vities.  (c.f.  b5)  However,  QA  manager 
must  take  more  responsibilities  than 
before.  SEMS  can  refer  the  information  of 
system  operations  related  with  the  error 
information  in  testing  and  operational 
stages,  (c.f.  b8) 

Generally,  it  is  not  well  known  that  what 
kind  of  data  items  are  meaningful  and  how 
much  we  need  for  software  reliability 
evaluation  during  the  development.  Thus,  it 
is  important  and  meaningful  to  accumulate 
the  error  information  and  to  gain  the 


experience  by  using  automated  tool  such  like 
SEMS.  (c.f.  b7 ) 

Most  of  the  second  class  of  problems 
are  beyond  the  scope  of  SEMS.  For  instance, 
the  problems  bl8  and  b21  can  not  be  solved 
without  changing  the  consciousness  of 
project  managers.  However,  SEMS  provides 
the  protection  function  for  error  database, 
(c.f.  bl7 ,  b20) 

Finally,  to  implement  and  use  SEMS  do 
cost  more  than  manual  evaluation.  Compared 
with  the  SDMSS,  SEMS  spends  little  amount 
of  computer  time  and  o'  er  resources.  The 
cost  problem  is  very  common  to  every  auto¬ 
mated  tool.  Since  the  software  reliability 
is  the  most  important  problem  in  software 
industries,  we  must  do  something  for  it. 

SEMS  can  collect  and  analyze  error  data 
more  efficiently  than  other  existing  methods 

Totally,  SEMS  can  improve  some  of  the 
major  problems  on  software  error  data 
collection  and  analysis. 

(2)  On  software  reliability  evaluation 

After  the  review  of  existing  techniques 
we  have  concluded  that  there  is  no  single 
metric  or  reliability  model  which  can  give 
a  universally  useful  reliability  evaluation 
or  an  estimation,  from  the  theoretical  point 
of  view.  However,  by  taking  into  account 
the  application  methodologies  which  cover 
the  measurement  and  compilation  of  reliabi¬ 
lity  test  data,  restrict  the  object  of 
application,  and  modify  the  debugging  and 
testing  processes  to  meet  with  assumptions 
made,  this  conclusion  may  be  changed,  in 
some  specific  environment,  we  can  obtain 
useful  estimation  to  fix  our  aim,  by  using 
multiple  reliability  models.  Though  we  are 
developing  new  models  vigorously,  we  should 
place  greater  emphasis  on  the  development 
of  application  methodologies  and  the 
estimation-aid  tools.  In  this  sense,  SEMS 
may  stimulate  the  studies. 

Finally,  we  would  like  to  claim  again. 
By  using  such  a  system  we  proposed,  as  we 
continue  to  collect  and  analyze  more  and 
more  data  on  how,  how  often,  when,  where, 
and  why  people  make  software  errors,  and 
how  people  detect  and  correct  software 
errors,  we  will  be  able  to  get  clear  insig¬ 
hts  on  how  to  evaluate  software  reliability, 
how  to  develop  practical  models  for 
predicting  software  reliability,  how  to 
avoid  making  software  errors,  and,  how  to 
organize  validation  strategies. 
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Fig.  1.  Software  Error  Management  System 
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ERROR  DATABASE 


Events  File 

J-<U////Z. 


Part  (1) 


Part  (2) 


Part  (3) 


Error  Database 


Part  (4) 


Basic  Record  Format 


! .Error  Id  Number  ixxxxxx-xxxxxx 
i. Category  Id  (tentative) 

-classification  #1,#2,#3,*4 . 

.Detected  by  (name) 

•Date  &  Time  detected 
.System  information 

-Hardware  Configuration  Id 
-Software  Configuration  Id 
-Database  Id 
-Test  System  Id 
.Failure  Location  (tentative) 

-Module (name, version  i  rev.  #, author) 
-Spec. (name, version  &  rev.#,  page) 
•Associated  Test  Cases 
-Validation  Id 
-Test  case  and  Data  Id 
.Associated  Trouble  Report  or 
Maintenance  Order  id. 


•Status  Information 

-Nothing  done  (date) 

-Error  analysis  (date/working  hours) 
.Correction  Started  (date) 

-Redefinition  of  requirements 
validation 

-Redesign  data 

validation  computer  time 

-Recoding  working  hours 

validation  7  • 

.Correction  completed  (date/work  hour) 
.Demonstrative  Test  (date/work  hour) 
•Find-and-fix  cycle  of  error 
■Approved  date 


.Formal  Judgement 

-hardware  error 
-operation  error 
-testing  error 
-unknown  (reoccurable?) 

.Software  error 

-formal  category  Ids  #1 , #2 , # 3, . .  . #n 
-formal  location 

module, page , line, author 
spec. , page , line , author 
-error  produced  by  erroneous  debug 
original  error  id 
.Associated  Errors  (ids) 

.Accountable  or  not 


l 

i 


i 


l 

I 


.comments  on  this  error 


!l 

•i 


2 


FEEDBACK  &  FEEDFORWARD 


INFORMATION  SOURCES 


reliability 

DATA 

MEASUREMENT 


_ i _ 

i 

iC 

•Failure 

Intervals 

PROJ. 

DATA 

.Test  Environment 

.Planned 

DATA 

t 

i 

BASE 

I 

* 

U 

ESTIMA- 

. Program 

^ - ^ 

QA  MANAGER 


Estimated! 
Results 


.Debug  Environment 


.  Results 


(statistic  file) 


Fig.  5.  Reliability  Estimation  Procedure 


Fiq.  6.  Comparable  Estimation  of  MTTF 


209 


ANALYSIS  OK  SOFTWARE  ERROR  MODEL  PREDICTIONS 
AND  QUESTIONS  OK  DATA  AVAILABILITY 


Alan  N.  Sukert 


Rome  Air  Development  Center  (AFSC) 
Griffiss  AKB  NY  1 3441 


Abstract 


During  the  period  Aug  1974  to  May  1978  a  study  to 
evaluate  the  accuracy  of  predictions  of  several 
models  for  predicting  the  error  content  and 
reliability  of  a  software  package  against  error 
data  extracted  from  four  large  Department  of 
Defense  software  development  projects  was 
undertaken  by  Rone  Air  Development  Center  (RADC). 
This  paper  will  briefly  describe  the  results  of 
this  empirical  study  for  three  such  models,  the 
Jelinski-Moranda,  Schick-Wolverton  and  a  modified 
Schick-Wolverton.  Model  predictions  will  he 
analyzed  and  general  conclusions  will  be  drawn  as 
to  model  applicability.  The  data  requirements  for 
performance  of  such  analysis  will  he  discussed  in 
lieu  of  the  data  RA DC  had  available  for  this  study 
and  the  data  needed  for  such  a  study. 

Introduction 

The  past  several  years  have  seen  the  formulation 
of  numerous  mathemat leal  models  for  predicting  the 
reliability  and  error  content  of  a  software 
system.  These  predictive  tools  were  needed  to 
permit  better  tracking  of  software  developments  by 
providing  a  software  manager  with  more  detailed 
information  regarding  the  status  of  his 

development.  Models  formulated  have  ranged  from 
the  earliest  ones  that  assumed  an  exponential 
distribution  of  time  to  detect  errors,  such  as 
Shoonan's,  to  more  complicated  ones  such  as  Musa's 
execut ion-t ime  model.  These  models  have  been 
experimentally  tested  against  available  software 
error  data  by  the  model  developers  and,  whenever 
possible,  comparisons  of  various  models  have  been 
performed  by  model  developers  in  order  to 

demonstrate  the  applicability  of  each  model. 
However,  criticism  of  this  initial  model  testing 
has  arisen  due  to  the  limited  quantity  and  nature 
of  the  software  error  data  available  to  model 
developers.  This  is  especially  true  with  respect 
to  DOD  software  development  projects,  with  their 
complex  and  unique,  one-time  applications  in  such 
areas  as  command  and  control  and  avionics. 

To  help  develop  a  better  knowledge  about  the 
applicability  of  these  software  reliability 
models,  and  to  obtain  better  confidence  in  their 
predictions,  RADC  has  been  analyzing  the 
predictions  of  several  software  reliability  models 
against  error  data  obtained  during  the  formalized 
testing  of  several  large  DOD  and  NASA  software 
developments.  This  paper  will  present  the  results 


of  the  RADC  study  and  discuss  the  problems  of  data 
availability  in  performing  this  study.  First,  the 
models  used  will  briefly  be  described.  Next  will 
follow  brief  descriptions  of  the  four  software 
development  projects  used  in  the  study.  Model 
predictions  will  be  presented  and  analyzed  with 
some  conclusions  offered  as  to  nodel 
applicability*  Finally,  the  problems  of  data 
availability  will  be  addressed. 

Model  Discussion 


The  initial  goal  of  this  in-house  study  was  to 
analyze  as  many  software  reliability  models  as 
possible,  using  as  many  software  error  data  sets 
as  possible.  As  the  study  evolved,  it  became  very 
apparent  that  the  limiting  factor  was  data 
availability.  Many  models  that  would  have  heen 
desirable  to  consider,  such  as  the  Shooman's 
exponential  and  Musa's  models,  were  eliminated 
because  the  data  available  to  RADC,  which 
consisted  of  data  extracted  from  Software  Problem 
Reports  (SPRs)  that  were  filled  out  by  the  various 
contractors  during  the  formal  test  phases  whenever 
a  software  error  was  detected,  was  lacking  in  some 
of  the  needed  categories  such  as  CPU  data.  The 
models  finally  examined  were  the  Jelinski-Moranda 
De-Eutrophication,  Schick-Wolverton,  Modified 
Schick-Wolverton,  Jelinski-Moranda  Geometric 
De-Eutrophication,  and  a  Modified  Geometric 
De-Eutrophication. 


Predictions  from  these  five  models  were  first 
analyzed  against  data  from  a  large  DOD  command  and 
control  project  on  a  total  project  basis  using, 
Maximum  Likelihood  (*n,F.)  estimates  for  model 
parameters.*  Next,  software  error  data  from  three 
additional  DOD  projects  were  analyzed  against  the 
three  non-geometric  models,  since  the  three 
non-geometric  models  predicted  the  number  of 
initial  errors.  Both  MLF  and  Least  Squares 
estimates  for  model  parameters  were  used,  and 
model  predictions  were  obtained  on  a  total  project 
basis  and  also  on  an  error  criticality  basis/* 
More  recently,  model  predictions  on  a  functional 
subsystem  basis  were  obtained  and  analyzed. 


The  assumptions  of  the  three  non-geomet ric  models 
used  are  given  in  Table  1,  while  the  mathematical 
equations  for  the  hazard  function  describing  each 
of  the  three  models  are  given  in  Table  2.  For  a 
more  detailed  description  of  the  models  the  reader 
should  consult  Reference  l. 
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Table  i.  Model  Assumptions 
Assumpt ions 


obtained,  is  given.  To  maintain  anonymity  the 
projects  are  referred  to  as  Projects  1,  2,  3  and 

4. 


Jelinski-  1. 
Mo  rand a 


2. 

3. 

Schick-  1 . 
tail  vert  on 


The  amount  of  debugging  tine  between 
error  occurrences  has  an  exponential 
distribution  with  an  error  occurrence 
rate  proportional  to  the  mnber  of 
remaining  errors. 

F.neb  error  discovered  is  immediately 
removed ,  thus  decreasing  the  total 
number  of  errors  by  one. 

The  failure  rate  between  errors  is 
constant • 

The  amount  of  debugging  time  between 
error  occurrences  has  a  Rayleigh 
distribution. 


2.  The  error  rate  is  proportional  to  the 
nimher  of  remaining  errors  and  the 
time  spent  in  d e bug g i ng . 

3.  Facli  error  is  immediately  removed, 
thus  reducing  the  nunber  of  errors 
by  one. 

Modified  Sane  as  Sch ick-Vo 1 vc r ton  except  for: 

Schick-  2.  The  error  discovery  rate  is  a 

U'olverton  constant  during  a  time  interval  and 
is  proportional  to  the  number  of 
errors  remaining,  the  total  tine 
previously  spent  in  debugging,  and  an 
"averaged"  error  search  tine  during 
the  current  debug  interval. 


Model 


Tabic  2.  Model  Equations 
Hazard  Function 


Jelinski- 

Mo rand a 

Schick- 

Volverton 

Modified 

Schick- 

Woive rton 


zUp  *  01 N  - 

z(t.)  -  «<|N  -  n1_()tl 

z(t.)  -  (ZSIN  -  -  t./21 


where:  (I  is  che  failure  rate 

N  is  the  number  of  initial  errors 

is  the  cumulative  nunber  of  errors  found 
through  the  i-th  debugging  interval 
T  is  the  cumulative  time  spent  debugging 
1  through  the  i-th  error 

Notes:  hazard  Function  is  the  probability  of  an 
error  occuring  in  a  given  infinitesimal  time 
interval  given  that  no  error  has  occurred 
previously  to  that  interval.  The  hazard  function 
is  related  to  the  reliability  R(t)  and  mean  time 
to  failure  “TTF  by  the  following: 


expl lz(  s)ds) 

(1) 

t  )dt 

0 

(2) 

Project  Discussion 


Project  1 

This  project  was  a  real-time  control  system  for  a 
land-based  radar  system  written  mostly  in 
J0VIAL/J3,  with  the  Executive  and  a  few  of  the 
application  modules  written  in  Assembly.^  The 
data  obtained  for  this  project  was  from  the  formal 
testing  of  all  the  project  software,  including  the 
Executive.  Formal  testing  began  with  guild 
Integration,  where  the  modules  were  tested  with 
the  system  executive  and  system  data  base.  Hpon 
successful  completion  of  this  testing  a  build  was 
formed,  which  then  was  passed  on  to  Acceptance 
testing.  After  completion  of  Acceptance  testing 
the  build  entered  Operational  demonstration,  where 
a  series  of  increasingly  demanding  mission 
profiles  designed  to  exercise  the  system  and 
evaluate  its  response  were  run.  It  is  important  to 
note  that  this  system  was  a  demonstration  model, 
i.e.  it  was  only  designed  to  demonstrate  that  a 
system  meeting  the  user  requirements  could  be 
designed  and  built.  It  was  never  intended  to 
become  operational. 

Project  l  software  was  developed  using,  both 
top-down  and  bottom-up  techniques  and  in  a  modular 
fashion.  For  example,  module  specifications  were 
derived  from  the  top-down  starting  with  the 
system-level  requirements.  System  integration  was 
performed  in  incremental  builds  to  check  the 
interrelationships  among  the  software  modules  and 
with  the  hardware.  Dummy  modules  ana  drivers  were 
used  for  testing  those  modules  not  part  of  a  given 
build. 

Project  2 

Project  2  was  a  command  and  control  system  written 
in  JOVIAL/J4.12  The  software  was  developed  in  a 
series  of  modifications  with  each  modification 
governed  by  a  separate  set  of  requirements  and 
developed  independently.  The  software  was 
developed  functionally,  i.e.  the  project  was 
divided  into  work  units  responsible  for  different 
functions.  Testing  of  each  modification  was 
conducted  in  five  phases  starting  with  Development 
testing  by  the  development  personnel  to 
demonstrate  specific  function.il  capabilities,  test 
data  extremes,  etc.  Formal  testing  began  after 
Development  testing  with  Validation  and  Acceptance 
testing.  Validation  testing  was  performed  hy  an 
Independent  test  group  at  the  subsystem  level  and 
demonstrated  the  approved  software  performance  and 
requirements.  Acceptance  testing  ran  a  subset  of 
the  Validation  tests  to  demonstrate  specific 
requirements.  After  this  testing  the  software 
underwent  final  Integration  testing  by  an 
independent  group.  This  Integration  testing 
demonstrated  that  the  applications  software 
correctly  interfaced  with  the  operating  system  and 
system  support  software.  Data  used  in  this  study 
was  fron  the  formal  testing  of  the  Project  2 
applications  software  only. 


In  this  section  a  description  of  each  of  the  four 
projects,  frixn  which  the  error  data  analyzed  was 
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Project  3 


Tabic  3.  Project  Characteristics 


Project  3  was  a  large  command  and  control  project 
written  in  .I0V1 AL/J4 .  12  Structurally  anti 
procedural  ly ,  Project  3  was  developed  similarly  to 
Project  2.  However,  the  Project  3  software 
underwent  an  extra  Operational  Demonstration  test 
phase  in  addition  to  the  five  test  phases  the 
Project  2  software  underwent.  The  Operational 
Demonstration  testing,  which  began  after 
Integration  testing,  was  designed  to  demonstrate 
the  software  in  an  operational  environment  using 
an  operational  tine  line  and  operational  data.  The 
data  obtained  from  this  project  was  from  the  four 
formal  test  phases  (Validation,  Acceptance, 
Integration,  Operational  Demonstration)  of  the 
applications  software. 

Project  4 

This  project  was  a  large  avionics  software 
application  program  written  in  J0VIAL/J3H  and 
Assembly. ^  The  software  consisted  of  five  major 
functional  areas  in  the  operational  software  ami 
two  in  the  simulation  software.  Testing  of  this 
software  began  with  Module  Verification  testing 
performed  by  the  developer  of  each  module.  Once 
this  testing  was  finished,  the  module  was  released 
for  formal  testing.  Formal  testing  began  with 
Inter-Module  Computability  testing  where  the 
software  was  checked  against  its  functional 
requirements  as  a  total  unit,  and  which  was  done 
by  the  software  development  group.  After 
completion  of  this  testing  the  software  system  was 
given  to  an  independent  system  test  group  for 
Systems  Validation  testing,  where  acceptance 
testing  for  quality  control  purposes  was 
performed.  The  data  obtained  for  this  project  was 
from  the  two  formal  test  phases  and  is  from  both 
the  operational  and  simulation  software  for  the 
first  two  versions  (called  blocks)  of  the  software 
system . 


Project 

1 

2 

3 

4 

Language 

Used 

J0VIAL/J3 
As  sen b l y 

J0V1AL/J4 

JOVIAL/ J4 

JoV I 4L/J3K 
Assembly 

Size 
( Li nes 
of  Code 
or  No. 
of  Mach. 
Inst . ) 

8f>7SU(  J ) 
499uO(A) 

Sf>931(J) 

1  l  53 4 6 ( J ) 

4 0640 (J ) 
84065(A) 

No .  of 
Modules 

109 

173 

249 

Operate 

Mode 

Real-Time 

lintch 

Batch 

Real-T i me 

Formal 

Build  In¬ 

In  Cep  rii- 

1 ntegra- 

Inter- 

Testing 

tegra  t  ion 

t  ion 

t  ion 

Modu 1 o 

Accept¬ 

Va lida- 

Va 1 i dn- 

Conp.'ita- 

ance 

t  ion 

t  inn 

b  i  1  i  t  v 

Opera- 

Accept¬ 

Accept¬ 

Systems 

t ionnl 

Demo n- 

stra- 

t  ion 

ance 

ance 

Ope ra- 
t  ion.il 
Deni  a— 

st  rat¬ 
ion 

Va 1 j  rfa- 
t  ion 

date  that  a  subsystem  began 

testing,  as 

two  date: 

for  model  analysis  to  see  if  the  start  date  ut 
model  prediction  affected  model  predictability. 
All  results  will  be  given  for  both  dates. 
Finally,  operational  data  was  available  for 
Project  3  only.  Thus  all  remarks  made  concerning 
model  predictability  for  Projects  1,  2  and  4  are 
based  on  conversations  with  project  developers  and 
on  relative  comparisons. 


Table  3  contains  a  summary  of  the  four  projects. 

Model  Results 

The  data  used  in  this  study  was  from  SPRs  and  was 
restricted  to  those  errors  that  resulted  in  a 
change  to  the  software  itself.  The  reason  for  this 
was  that  although  unquest ionably  documentation 
errors  are  important  and  should  be  considered 
along  with  the  other  types  of  software  errors,  a 
confusion  in  the  interpretation  of  the  Project  2 
and  3  non-software  errors  forced  the  arbitrary 
decision  to  eliminate  all  non-software  errors, 
including  document at  ion  errors,  to  avoid  confusion 
in  interpreting  model  predictions.  Opening  dates 
were  used  instead  of  closing  dates  on  the  SPRs 
because  of  biases  introduced  in  closing  SPRs  due 
to  prioritizing  and  schedule  demands. 

The  data  was  organized  into  errors  per  day  and 
errors  per  week  to  see  how  the  use  of  different 
time  frames  for  the  data  affected  model 
predictions.  Since  the  data  was  subdivided 
functionally  for  all  four  projects,  it  was  decided 
to  use  the  date  of  the  first  SPR  and  the  latest 


All  results  will  be  presented  in  terms  of  tie 
predicted  number  of  remaining  errors,  i.e.  the 
number  of  predicted  initial  errors  minus  the 
number  of  errors  found  to  date.  For  notation 
purposes,  the  models  will  be  denoted  as  follows: 

Je  1  i  n ski -Mo rand a  (“axinum  Likelihood):  J” 
del inski-Mor.mda  (Least  Squares):  1JM 
Sc  hick-Wul  vert  on  (Maximum  Likelihood):  S\> 
Schick-Wolvorton  (Least  Squares):  LSI* 

Modified  Sch i ck-L'ol ve rt on :  MODS1*’ 

Table  4  presents  a  sunmary  of  the  total  project 
ami  summed  criticality  and  subsystem  predictions 
for  the  five  models  for  Projects  1-4.  Note  that 
— "  indicates  nonconvergence  of  the  estivate 
equations  lor  that  particular  model.  Msu,  the 
numbers  in  *'[ ]"  are  the  number  of  errors  found  up 
through  the  end  of  formal  testing,  aim  the  number 
in  M( )"  for  Project  3  is  the  actual  number  of 
remaining  errors. 
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T.il'U- 

4.  Tot 

al  vs 

.  Summed  Pr 

ed  it  t  ions 

Total 

System 

Cr  i  t  ic.il  i  ty 

Subsystem 

St.irt 

Pro 

vf 

Prod 

Pred 

i\i 

Pate 

‘•ode  l 

Day 

i-t'i’ii 

Hay 

Week 

Ihiy 

Week 

l 

1-02-71 

J N 

724 

494 

802 

754 

— 

— 

1 1*33) 

SW 

2  b 

154 

37 

1  lb 

— 

— 

NO  HSU 

14 

1  1 

1  1 

12 

43 

38 

UN 

54 

.89 

62  7 

380 

1326 

678 

LSW 

8821 

inn 

1647  7 

4694 

26995 

1  1793 

3-011-71 

JN 

499 

4,80 

535 

506 

— 

— 

SL 

14 

10,8 

4  1 

73 

— 

— 

NODSW 

8 

7 

8 

A 

23 

20 

LIN 

51 

57 

832 

397 

1524 

620 

LSW 

8814 

984 

16979 

4824 

28028 

1  1828 

2 

10-14-71 

1212] 

JI5 

SW 

- — 

_ 

_ 

_ _ 

_ 

— — 

M0DSW 

82 

75 

297 

169 

— 

— 

lj:i 

140 

37 

372 

188 

633 

373 

LSW 

1998 

750 

3891 

1  677 

5321 

2594 

1-17-72 

JN 

72 

44 

— 

— 

— 

— 

[ISO] 

SW 

39 

54 

— 

— 

— 

— 

Ntmsw 

5 

3 

15 

9 

— 

— 

LJN 

99 

46 

211 

140 

483 

271 

LSW 

1315 

330 

2237 

560 

4224 

1902 

3 

ft-Ol-73 

jr 

1  288 

1179 

— 

— 

— 

— 

[2191  ] 

SW 

955 

1289 

— 

— 

— 

— 

(  198) 

NODSW 

42 

25 

504 

415 

65 

43 

LJN 

88 

23 

390 

143 

814 

419 

LSW 

2155 

685 

2983 

2023 

11140 

3550 

7-28-73 

JN 

233 

198 

— 

— 

— 

— 

[1307] 

SW 

193 

220 

— 

— 

— 

— 

NO  USI/ 

10 

1 

55 

35 

52 

52 

UN 

81 

27 

338 

139 

645 

340 

LSW 

1322 

505 

2006 

1  166 

5966 

2355 

4 

5-22-73 

JN 

— 

— 

— 

— 

l 1877] 

SW 

— 

— 

— 

— 

Nonsw 

7830 

6328 

— 

— 

LJN 

174 

74 

2710 

1213 

LSW 

9482 

2202 

28629 

12379 

8-27-74 

JN 

650 

591 

— 

— 

1 1  509  ] 

SW 

185 

625 

— 

— 

M0DSW 

27 

22 

153 

473 

LJN 

1  16 

63 

739 

488 

LSW 

2  34 

1103 

10682 

3250 

Total  Project  Comparison 

As  can  he  seen  fron  Table  4,  there  is  a 
considerable  difference  in  predictability  when  the 
different  start  dates  for  model  prediction  nr*’ 
vised,  and  this  difference  shows  up  for  all  four 
projects.  For  example,  for  Projects  2  and  4  the  J!! 
and  SW  models  fail  to  converge  using  the  daf** 
testing  started  as  the  start  date  for  model 
predictions,  while  these  two  models  did  converge 
when  the  date  all  models  were  ready  for  testing 
was  used  as  the  start  date.  Notice  the 
(significant  drop  in  remaining  error  predictions  in 
many  instances  between  the  use  of  the  two 
different  start  dates.  For  instance,  for  Project  4 
there  is  a  factor  of  IOC)  drop  in  the  renal ning 


error  predictions  for  the  “OliSU  model  when  8-27-74 
is  vised,  as  opposed  to  5-22-73,  ts  t  Ik*  start  date. 
The  same  is  true  for  the  NooSW  remaining  error 
predictions  for  Project  2,  although  here  there  is 
only  4 1  factor  of  10  difference.  However,  there  art- 
instances,  part  icul  arl  y  for  the  LIN  and  LSW 
models,  when  there  is  no  significant  difierenc* 
between  the  retraining  error  predictions  using  the 
two  different  st.irt  dates.  Thus  it  would  appear 
from  the  total  project  predictions  that  the 
difterence  in  start  dates  affects  t  fie  tlaximun 
Likelihood  parameter  estimations  more  than  i  fie 
Least  Squares  parameter  est f ma t ions . 

The  daily  versus  weekly  model  predictions  do  not 
offer  as  convincing,  a  pattern.  In  most  cases  there 
does  not  appear  t«>  be  a  significant  difference 
between  using  the  day  and  wee1'  as  the  ti**e 
interval,  although  t ho  LIN  and  hSh  models  do  in 
sone  cases  show  some  nontrivial  differences.  For 
example,  for  Project  2  usi  •  the  10-14-71  start 
date  the  remaining  error  predictions  for  the  LJ" 
model  were  140  using  the  day  anti  5  7  using  the 
week.  The  LSW  model  predicted  1315  using  the  day 
and  330  using  the  week  for  Project  2  with  a 
1-17-72  start  date.  However,  in  some  vses  t he 
"weekly”  prediction  was  greater  than  the  "daily" 
prediction.  lor  example,  t  he  L.1M  model  predicted 
SI  remaining  errors  using  the  day  .»<.<!  H?  error** 
vising  the  week  for  Project  I  with  a  3-06-73  start 
date. 

On  a  project  versus  project  basis,  clearly  t  fit  LSW 
model  predicts  much  higher  values  than  the  other 
models,  while  the  N0..SW  model  generally  predicts 
lower  values  tfi.it  the  ocher  models.  I  ron  t  he 
actu.ii  remaining  error  count  for  Project  3  it  is 
clear  that  those  using  7-2S-73  are  .much  more 
accurate  than  those  using  6-01-73,  with  sone 
predictions  being  almost  "too  goixi".  Since  Project 
2  and  3  art  both  command  and  control  projects,  one 
would  expect  that  the  same  pattern  of  model 
predictability  holds  for  both  projects.  Fron  Table 
4  it  does  appear  that  using  1-17-72  as  the  start 
date  gives  realistic  predictions  for  all  models 
except  the  LSI.  model,  while  for  the  10-14-71  start 
date?  only  the  NOPSW  and  LJN  models  give  reasonable 
predictions.  One  would  hope  that  the  actual  number 
of  remaining  errors  was  closer  to  the  ”1-17-72" 
predictions  than  the  "10-14-71”  predictions. 
Project  1,  being  a  variation  of  a  command  and 
control  project,  would  also  hopefully  give  the 
same  pattern  of  model  predictability,  and  from 
Table  4  it  appears  that  this  is  so.  For  Project 
4,  since  it  is  an  avionics  development  and  thus 
significantly  different  from  the  other  three 
projects,  one  would  be  interested  in  any 
differences  in  model  predictability.  Fron  Table  4, 
one  can  note  that  the  same  general  pattern  .appears 
with  respect  to  the  difference  between  the  two 
start  day  predictions.  However,  it  is  interesting 
to  note  that  the  Project  4  predictions  seen  to  be 
higher  overall  than  those  for  the  other  three 
projects.  The  notable  exception  is  the  LSW  model 
predictions  for  the  8-27-74  start  date.  This  does 
suggest  (at  least  for  the  limited  data  available) 
that  the  Naximun  Likelihood  parameter  estimates 
might  not  he  as  accurate  for  avionics  software 
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predictions  as  the  Least  Squares  estimates,  while 
just  the  reverse  holds  for  the  Project  1-1 
predictions  (when  the  models  conversed), 
obviously  more  testing  is  needed  to  verify  this 
hypothesis. 

Criticality  and  Subsystem  Predictions 

Note  iron  Table  4  that  no  criticality  data  was 
available  for  Project  4.  In  order  to  understand 
the  criticality  *»redict  ions,  it  is  important  to 
note  that  the  categori zat ion  of  each  error  was 
made  (by  the  developer)  on  the  basis  of  the  deg ree 
it  was  felt  that  error  would  impede  t ho  execution 
of  a  test  case  or  prohibit  deronst rat  ion  of  a 
requi renent •  As  can  be  seen  from  Table  4,  just  as 
for  the  total  project  basis  the  nodel  predictions 
are  generally  lower  for  the  case  when  the  date  all 
nodules  are  ready  for  test  in*;  is  used  as  the  start 
date  than  when  the  start  of  testing  was  used  as 
tie  start  date.  The  main  exception  is  the  LSV 
nodel  predictions  for  Project  l.  There  is  no 
three  project  pattern  between  the  "day1*  and  "week" 
predictions.  For  example,  for  Project  1  with  a 
1-2-73  start  date,  the  criticality  predictions  for 
the  J:i,  :01)SU,  U'1  and  LSW  models  tor  the  "week" 
are  less  than  or  equal  to  the  "day"  predictions. 
Powever,  the  SW  model  predictions  are  higher  for 
the  "week"  than  lor  the  "day".  Note,  however,  that 
except  for  the  Project  •  i SW  predictions  the 
"week"  predictions  for  Projects  2  and  3  were  lower 
than  the  "day"  predictions. 

From  Table  4  it  also  appears  that  for  the 
subsystem  predictions,  as  was  the  case  for  the 
criticality  and  total  project  predictions,  us  inf; 
the  date  when  all  nodules  art*  ready  for  testing, 
gives  generally  more  realistic  predictions  than 
using  the  date  testing,  actually  starts.  Also,  for 
the  subsystem  predictions  the  "day"  versus  "week" 
predictions  showed  a  fairly  consistent  pattern  for 
Projects  1-4  of  the  "week"  predictions  being  lower 
than  the  "day"  predictions,  as  was  the  case  for 
tie  Project  1-3  criticality  predictions,  with  the 
exception  of  Project  4  using  the  <1-27-74  start 
date. 

Summarizing.  Table  4  we  see  that  in  most  cases  the 
"sunned"  predictions  are  greater  than  the  "total" 
predictions.  Note,  however,  that  for  Project  1  the 
SW  criticality  predictions  total  less  than  the 
"lotal"  predictions  for  both  start  dates.  Note 
also  that  the  difference  between  the  "sunneo"  and 
"total"  predictions  is  less  for  Project  1  than  for 
tue  other  projects.  Overall,  the  NOdSW  model 
appeared  to  have  the  most  consistent  predictions, 
with  the  U::  and  LSW  models  appearing  to  have  the 
least  consistent  predictions.  It  appears  that 
generally  the  criticality  predictions  are  closer 
to  the  "total"  predictions  than  are  the  subsystem 
predictions.  However,  this  is  not  totally 
consistent,  since  for  Project  3  the  subsystem 
predictions  are  closer  to  the  total  than  the 
criticality  predictions  for  the  !K)I)SW  model.  Thus 
one  can  not  draw  any  dominant  patterns  from  this 
"total"  versus  "summed"  analysis. 


Cone  1  us i ons 

In  presenting  these  results,  attempts  have  been 
made  to  draw  general  conclusions  about  nodel 
predictions.  Since  no  totally  consistent  patterns 
have  evolved  in  most  cases,  general  conclusions 
ar«*  difficult.  "owever,  since  in  most  cases 
Project  1  appears  to  deviate  from  patterns  that 
•ire  uoni  iant  for  the  other  projects,  and  since 
Project  l  was  never  intended  to  become  operational 
wuile  the  other  three  were,  one  could  eliminate 
Project  1  anu  make  conclusions  on  the  basis  of 
model  predictions  tor  the  other  three  projects.  In 
formulating  die  following  conclusions,  1  have  done 
this  only  with  respect  to  the  subsystem  and 
criticality  predictions,  since  the  pattern  was  so 
doni  rant  for  the  Project  2-4  predictions. 

Before  stating  any  conclusions,  however,  a  few 
words  are  needed  about  the  nonconvergence  of  the 
dh  and  SW  node  Is  (and  in  some  cases  the  h()hS‘.‘ 
model).  It  had  been  hypothesized  in  [I]  that  the 
reason  for  this  nonconvergence  was  t  fie  size  of  the 
data  the  models  were  applied  against.  The  sizes  of 
some  of  the  data,  particularly  Project  2,  somewhat 
negates  this  hypothesis,  however,  there  does 
appear  to  he  a  pattern  of  nonconvergence  for  those 
data  sets  where  the  error  density,  i.e.  t lie  number 
of  errors  found  per  unit  time,  is  sparse  and 
uneven.  This  is  especially  true  for  Project  4. 
Thus  it  does  appear  that  a  significant  factor  in 
determining  the  convergence  of  the  "axinun 
Likelihood  estimates,  since  t  lie  Least  Squares 
estimates  always  converged,  is  the  rate  of  error 
detection.  This  would  seen  reasonable,  since  all 
three  basic  models  implicitly  assume  a  constant 
level  of  testing.  A  sparse  uneven  error  detection 
density  would  certainly  tend  to  negate  this 
assumption.  However,  more  research  is  needed  to 
verify  this. 

From  the  above  analysis,  then,  the  following, 
general  conclusions  can  be  drawn: 

1.  Clearly  it  is  better  to  use  the  date  all 
modules  are  ready  for  testing  to  begin  model 
predictions  than  the  date  testing  actually  begins. 
This  pattern  was  almost  universally  consistent 
among  all  the  predictions. 

2.  For  "command  and  control"  projects  such  as 
projects  2  and  3,  it  appears  that  the  ‘'axir.un 
Likelihood  estimates,  when  they  converge,  give 
more  reasonable  and  accurate  estimates  than  the 
Least  Squares  estimates. 

3.  For  "avionics"  projects  sue)}  as  Project  4,  it 
appears  that  the  Least  Squares  estimates  are  more 
reasonable  and  accurate  than  the  Laxinun 
Likelihood  estimates.  However,  this  conclusion  is 
somewhat  suspect  due  to  the  nonconvergence 
problems  for  the  Project  4  data. 

4.  For  the  criticality  and  subsystem  predictions, 
it  appears  that  using  the  week  as  the  time 
interval  gives  more  reasonable  predictions  than 
using  the  day  as  the  time  interval. 


214 


Oat  a  Availability 

At  this  point  the  problem  of  data  availability 
needs  to  be  stressed.  All  of  the  conclusions  made 
have  been  based  upon  work  on  data  available  to 
RADC.  This  data  has  some  nice  features,  such  as 
the  availability  in  must  cases  of  criticality 
designations  and  the  ability  to  restructure  the 
data  based  upon  various  classifications  such  ns 
funct ional  subsystems.  However,  the  data  does  have 
sone  serious  limitations.  First,  the  data  is 
historical  in  nature.  Thus  much  of  the  information 
that  one  needs  to  perform  a  complete  analysis  of 
different  software  error  prediction  models, 
particularly  CPU  tine  data,  was  either  totally 
lacking  or  of  such  a  cumulative  nature  as  to  be 
unusahle.  Second,  the  definitions  among  different 
projects  varied.  Tor  example,  what  one  project 
called  integration  testing  was  not  what  another 
project  meant  by  integration  testing.  How  the 
various  error  categories  were  defined,  as 
mentioned  earlier  for  the  Project  2  and  3  data, 
differed  anon;*  the  four  projects.  Also,  the  four 
projects  were  not  consistent  in  what  information 
could  be  provided.  For  example,  no  criticality 
data  was  available  for  Project  4.  On  another 
project  from  which  error  data  was  obtained  but  not 
used  in  this  study,  no  information  was  available 
as  to  the  opening,  date  of  the  SPRS;  only  the 
closing  date  was  available, 

With  the  obvious  shortcomings  of  the  data 
available  to  KADC,  it  is  more  than  likely  that  as 
more  data  does  become  available  the  conclusions 
drawn  in  this  analysis  nnv  have  to  he  slightly  or 
totally  altered.  In  addition,  several  patterns 
were  noticed  in  analysis  of  the  RAPC  data,  such  as 
the  problem  of  model  nonconvergence  and  the 
greater  accuracy  of  the  MLR  or  beast  Squares 
estimates  under  certain  conditions  and  for  certain 
’’types"  of  projects,  that  need  more  research. 
However,  to  perform  this  additional  research 
requires  both  additional  and  more  complete  data. 
More  data  is  also  needed  to  perform  more 
statistically  valid  tests  for  interpreting  model 
accuracy • 

Thus  it  becomes  very  clear  that  the  limitations  of 
the  data  available  to  RAPC,  and  to  most  attempts 
to  validate  software  error  prediction  models, 
requires  more  complete,  and  simply  more,  data. 
This  implies  several  issues,  though.  First, 
precise  definitions  need  to  be  made  of  the  data 
elements  that  should  be  collected  on  software 
projects  so  that  the  maximum  benefit  can  be 
obtained  in  terms  of  software  error  model 
analysis.  In  the  case  of  POP,  this  translates  Into 
the  nerd  for  devel oj^jent  of  appropriate  Data  Item 
Descriptions  (DIDs)  that  can  be  used  in  software 
procurements  to  require  a  contractor  to  collect 
the  desired  software  error  data  in  the  desired 
formats.  Second,  data  collection  on  on-going 
software  developments  must  be  initiated  to  provide 
complete  and  accurate  data  of  the  type  necessary 
for  model  analysis.  This  data  collection  is  also 
necessary  to  analyze  projects  that  use,  for 
example,  modern  programming  practices  to  perform 
analysis  of  the  benefits  of  such  practices  in  a 


formalized  statistical  manner. 

bast,  and  most  important  from  an  RAPC  viewpoint, 
there  is  the  need  for  a  centralized  data  »>ase 
uhere  researchers  can  obtain  software  data  of  the 
form  necessary  for  his  particular  research.  This 
data  base  must  permit  the  specification  of  various 
categories  and  formats  of  software  data  on  either 
a  single  project  or  cross  project  basis.  The  most 
difficult  problem  that  wa»  faced  in  the 
performance  of  this  RAPC  study  was  the*  large  time 
required  to  take  the  data  from  its  original  card 
image  tape  form  and  extract  the  necessary 
information  in  the  required  form,  for  input  Into 
the  various  models.  A  computerized  data  base  for 
this  type  of  information  would  have  greatly 
expedited  model  analysis.  Also,  a  computer i zed 
data  base  would  facilitate  storage  of  more  data 
sets,  and  would  help  in  specifying  unique  and 
consistent  descriptors  of  the  various  data 
elements  so  that  the  "apples  and  oranges"  problem 
can  be  eliminated.  Finally,  this  data  base  will 
facilitate  the  use  of  software  data  for  purposes 
other  than  research,  most  notably  in  tracking,  and 
managing  large  software  development  projects.  pAOC 
is  currently  developing  a  pilot  Data  Analysis 
Center  for  Software  (PACS)  that  will  service  all 
kinds  of  software  data  and  which  will  address  the 
problems  described  above.  Clearly  such  a  center 
is  needed  for  both  research  in  software  error 
prediction,  and  in  providing  bettor  methods  for 
monitoring,  software  developments. 
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